Nucleic Acids and Proteins Associated with Sucrose Degradation in Coffee

ABSTRACT

Disclosed herein are nucleic acid molecules isolated from coffee (Coffea spp.) comprising sequences that encode various sucrose metabolizing enzymes, along with their encoded proteins. Specifically, three types of invertase and four invertase inhibitors and their encoding polynucleotides from coffee are disclosed. Also disclosed are methods for using these polynucleotides for gene regulation and manipulation of the sugar profile of coffee plants, to influence flavor, aroma, and other features of coffee beans.

FIELD OF THE INVENTION

The present invention relates to the field of agricultural biotechnology. More particularly, the invention relates to enzymes participating in sucrose metabolism in plants, coffee in particular, and the genes and nucleic acid sequences that encode these enzymes, along with the regulatory mechanisms that control sucrose metabolism via these enzymes.

BACKGROUND OF THE INVENTION

Various publications, including patents, published applications and scholarly articles, cited throughout the present specification are incorporated by reference herein, each in its entirety. Citations not fully set forth within the specification may be found at the end of the specification.

Sucrose plays an important role in the ultimate aroma and flavor that is delivered by a coffee grain or bean. Sucrose is a major contributor to the total free reducing sugars in coffee, and reducing sugars are important flavor precursors in coffee. During the roasting of coffee grain, reducing sugars react with amino group-containing molecules in a Maillard type reaction, which generates a significant number of products with caramel, sweet and burnt-type aromas and dark colors that are typically associated with coffee flavor (Russwurm, 1969; Holscher and Steinhart, 1995; Badoud, 2000). The highest quality Arabica grain (Coffea arabica) has been found to have appreciably higher levels of sucrose (between 7.3 and 11.4%) than the lowest quality Robusta grain (Coffea canephora) (between 4 and 5%) (Russwurm, 1969; Illy and Viani, 1995; Chahan et al., 2002; Badoud, 2000). Despite being significantly degraded during roasting, sucrose still remains in the roasted grain at concentrations of 0.4-2.8% dry weight (DW), thereby contributing directly to coffee sweetness. A clear correlation exists between the level of sucrose in the grain and coffee flavor. Therefore, identifying and isolating the major enzymes responsible for sucrose metabolism and the underlying genetic basis for variations in sucrose metabolism will enable advances in the art of improving coffee quality.

Currently, there are no published reports on the genes or enzymes involved in sucrose metabolism in coffee. However, sucrose metabolism has been studied in tomato, Lycopersicon esculentum (a close relative of coffee; both are members of the asterid I class), especially during tomato fruit development. An overview of the enzymes directly involved in sucrose metabolism in tomato is shown in FIG. 1 (Nguyen-Quoc et al., 2001). The key reactions in this pathway are (1) the continuous rapid degradation of sucrose in the cytosol by sucrose synthase (SuSy) and cytoplasmic invertase (I), (2) sucrose synthesis by SuSy or sucrose-phosphate synthase (SPS), (3) sucrose hydrolysis in the vacuole or in the apoplast (region external to the plasma membrane, including cell walls, xylem vessels, etc.) by acid invertase (vacuolar or cell wall bound) and (4) the rapid synthesis and breakdown of starch in the amyloplast.

As in other sink organs, the pattern of sucrose unloading is not constant during tomato fruit development. At the early stages of fruit development, sucrose is unloaded intact from the phloem by the symplast pathway (direct connections between cells) and is not degraded to its composite hexoses during unloading. Both the expression and enzyme activity of SuSy are highest at this stage and are directly correlated with sucrose unloading capacity from the phloem (a phenomenon also called sink strength; Sun et al., 1992; Zrenner et al., 1995). Later in fruit development, the symplastic connections are lost. Under these conditions of unloading, sucrose is rapidly hydrolyzed outside the fruit cells by the cell wall bound invertase and then the glucose and fructose products are imported into the cells by hexose transporters. Sucrose is subsequently synthesized de novo in the cytoplasm by SuSy or SPS (FIG. 1). SPS catalyses an essentially irreversible reaction in vivo due to its close association with the enzyme sucrose phosphate phosphatase (Echeverria et al., 1997). In parallel to the loss of the symplastic connections, SuSy activity decreases, and eventually becomes undetectable in fruit at the onset of ripening (Robinson et al., 1998; Wang et al., 1993). Therefore, late in the development of tomato fruit, the SPS enzyme, in association with SP, appears to be the major enzyme for sucrose synthesis.

Plant invertases have been separated into two groups based on the optimum pH for activity. Invertases of the first group are identified as neutral invertases, which are characterized as having a pH optimum in the range of 7-8.5. The neutral invertases have been found to be located in the cytosol of plant cells. Invertases of the second group are identified as acid invertases, which are characterized as having a pH optimum for activity between pH 4.5 and 5.5. The acid invertases have been shown to exist in both soluble and insoluble forms (Sturm and Chrispeels, 1990). Insoluble acid invertase is irreversibly and covalently associated with the cell wall, whereas soluble acid invertase is located in both the vacuole and apoplast.

Research over the past decade has shown that vacuolar as well as cell-wall bound invertases are key enzymes in the regulation of sucrose metabolism during fruit development of various species. Red-fruit species of tomato, such as the commercial species Lycopersicon esculentum and the wild species L. pimpinellifolium, for example, do not store high levels of sucrose but, instead, accumulate hexoses in the form of glucose and fructose. Evidence from crosses of red-fruit species with sucrose-accumulating green-fruit species (Yelle et al., 1991) has shown the crucial role of acid invertase in preventing final sucrose accumulation in red-fruited tomato species. Genetic analysis studies have located the locus conferring high levels of soluble solids in L. pimpinellifolium fruit to the known position of vacuolar invertase TIV1 (Tanksley et al., 1996; Grandillo and Tanksley, 1996). A similar conclusion was reached from the analysis of expression of an antisense TIV1 cDNA construct in transgenic tomatoes (Klann et al., 1993; Klann et al., 1996). Thus the vacuolar form of invertase is considered to play a major role in both the regulation of hexose levels in mature fruits and in the regulation of mobilization of sucrose stored in the vacuoles (Klann et al., 1993; Yau and Simon, 2003). The cell wall bound isoforms are believed to be involved in phloem unloading and sucrose partitioning (Scholes et al., 1996).

The importance of cell wall bound invertase has been demonstrated by studies with transgenic tomato (Dickinson et al., 1991) and tobacco (von Schaewen et al., 1990) plants that overexpress cell wall invertase in a constitutive fashion. Elevated levels of invertase activity in such plants caused reduced levels of sucrose transport between sink and source tissues, which led to stunted growth and overall altered plant morphology. Reduction of extracellular invertase activity has also been shown to have dramatic effects on plant and seed development in various species. Analysis of transgenic carrots with reduced levels of cell wall invertase due to the constitutive expression of an antisense cell wall invertase construct (Tang et al., 1999) has shown dramatic consequences on early plant development as well as on tap root formation during the early elongation phase.

Studies of the miniature-1 (mn1) (Lowe and Nelson, 1946) seed mutant in maize, which is characterized by an aberrant pedicel and a drastic reduction in the size of the endosperm, have shown that the Mn1 seed locus encodes a cell wall invertase, CWI-2 (Miller and Chourey, 1992; Cheng et al., 1996). Interestingly, in the mn1 mutant, global acid invertase (vacuolar and cell wall bound) activity is dramatically reduced, suggesting coordinate control of both the vacuolar and cell wall enzyme activities.

Because of the importance of sucrose for high quality coffee flavor, a need exists to determine the metabolism of sucrose in coffee beans and the interaction of genes involved in that metabolism. There is also a need to identify and isolate the genes that encode these enzymes in coffee, thereby providing genetic and biochemical tools for modifying sucrose production in coffee beans to manipulate the flavor and aroma of the coffee.

SUMMARY OF THE INVENTION

One aspect of the present invention features a nucleic acid molecule isolated from coffee (Coffea spp.) comprising a coding sequence that encodes an invertase or an invertase inhibitor. In one embodiment, the coding sequence encodes an invertase, which may be a cell wall invertase, a vacuolar invertase or a neutral invertase. In specific embodiments, the cell wall invertase comprises a conserved domain having amino acid sequence WECPDF. In various embodiments, the invertase comprises an amino acid sequence greater than 55% identical to SEQ ID NO:9 or SEQ ID NO:12, and preferably comprises SEQ ID NO:9 or SEQ ID NO:12. In exemplary embodiments, the nucleic acid molecule comprises SEQ ID NO:1 or SEQ ID NO:4.

In another embodiment, the invertase is a vacuolar invertase and comprises a conserved domain having amino acid sequence WECVDF. The vacuolar invertase may comprise an amino acid sequence 70% or more identical to SEQ ID NO:10, and preferably comprises SEQ ID NO:10. In an exemplary embodiment, the nucleic acid molecule encoding the vacuolar invertase comprises SEQ ID NO:2.
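By way of illustration only, the two conserved domains named above (WECPDF for cell wall invertase and WECVDF for vacuolar invertase) could be used to sort candidate acid invertase sequences. The following Python sketch is a minimal example under stated assumptions: the function name and the toy sequences are hypothetical and are not taken from the sequence listing, and the presence of a motif alone would not establish the identity of an enzyme.

```python
# Conserved domains named in the text:
#   WECPDF -> cell wall invertase, WECVDF -> vacuolar invertase.
MOTIFS = {
    "WECPDF": "cell wall invertase",
    "WECVDF": "vacuolar invertase",
}


def classify_by_motif(protein_seq):
    """Return (label, motif, 0-based position) for the first motif found, else None.

    Hypothetical helper for illustration; it performs an exact string search only.
    """
    seq = protein_seq.upper()
    hits = []
    for motif, label in MOTIFS.items():
        pos = seq.find(motif)
        if pos != -1:
            hits.append((pos, label, motif))
    if not hits:
        return None
    pos, label, motif = min(hits)  # earliest occurrence wins
    return (label, motif, pos)


# Toy sequences (hypothetical, NOT from the sequence listing).
print(classify_by_motif("MKTAAWECPDFGGS"))
print(classify_by_motif("MKTAAWECVDFGGS"))
print(classify_by_motif("MKTAAGGS"))
```

A practical screen would typically combine such a motif check with a full sequence alignment, since a single short motif is only a coarse indicator.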

In another embodiment, the invertase is a neutral invertase, which may comprise an amino acid sequence 84% or more identical to SEQ ID NO:11, and preferably comprises SEQ ID NO:11. In an exemplary embodiment, the nucleic acid molecule encoding the neutral invertase comprises SEQ ID NO:3.

In other embodiments the coding sequence encodes an invertase inhibitor. In certain embodiments, the invertase inhibitor comprises four conserved cysteine residues in its amino acid sequence. The invertase inhibitor may comprise an amino acid sequence that is 25% or more identical to any one of SEQ ID NOS: 13, 14, 15 or 16, and preferably comprises any one of SEQ ID NOS: 13, 14, 15 or 16. In exemplary embodiments, the nucleic acid molecule encoding the invertase inhibitor comprises any one of SEQ ID NOS: 5, 6, 7 or 8.

In certain embodiments, the above-described coding sequence is an open reading frame of a gene. In other embodiments, it is an mRNA molecule produced by transcription of that gene, or a cDNA molecule produced by reverse transcription of the mRNA molecule. Another embodiment is directed to an oligonucleotide between 8 and 100 bases in length, which is complementary to a segment of the foregoing nucleic acid molecule.
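The oligonucleotide embodiment above can be illustrated with a short Python sketch that checks whether a candidate oligonucleotide falls within the stated length range and is complementary to a segment of a target strand. This is a sketch under assumptions not stated in the text: the range of 8 to 100 bases is treated as inclusive, only perfect complementarity over DNA bases A, C, G and T is tested, both sequences are written 5′ to 3′, and the function names and toy sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string written 5'->3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def is_complementary_oligo(oligo, target, min_len=8, max_len=100):
    """True if oligo is min_len..max_len bases (inclusive, an assumption) and is
    perfectly complementary to some segment of target."""
    oligo = oligo.upper()
    if not (min_len <= len(oligo) <= max_len):
        return False
    if set(oligo) - set("ACGT"):
        return False  # only unambiguous DNA bases are handled in this sketch
    # An oligo pairs with a segment when its reverse complement occurs in the target.
    return reverse_complement(oligo) in target.upper()


# Toy target strand (hypothetical, NOT from the sequence listing).
target = "ATGGCTTCAGGTCCATTGAA"
print(is_complementary_oligo("GACCTGAAGC", target))  # complementary to GCTTCAGGTC
print(is_complementary_oligo("GACCTGA", target))     # only 7 bases, too short
```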

Another aspect of the invention features a vector comprising the coding sequence of the invertase or invertase inhibitor encoding nucleic acid molecules described above. In certain embodiments, the vector is an expression vector selected from the group of vectors consisting of plasmid, phagemid, cosmid, baculovirus, bacmid, bacterial, yeast and viral vectors. Various embodiments comprise vectors in which the coding sequence of the nucleic acid molecule is operably linked to a constitutive promoter, or to an inducible promoter, or to a tissue specific promoter, preferably a seed specific promoter in the latter embodiment.

Host cells transformed with any of the above described vectors are also provided in another aspect of the invention. The host cells may be plant cells, bacterial cells, fungal cells, insect cells or mammalian cells. A fertile plant produced from a transformed plant cell of the invention is also provided.

Another aspect of the invention features a method of modulating flavor or aroma of coffee beans, comprising modulating production or activity of one or more invertases or invertase inhibitors within coffee seeds. In certain embodiments, the method comprises increasing production or activity of the one or more invertases or invertase inhibitors. In certain embodiments, this is accomplished by increasing expression of one or more endogenous invertase or invertase inhibitor genes within the coffee seeds. Other embodiments comprise introducing an invertase- or invertase inhibitor-encoding transgene into the plant.

In a particular embodiment, the method comprises increasing production or activity of one or more invertase inhibitors. In this embodiment, endogenous invertase activity in the plant may be decreased as compared with an equivalent plant in which production or activity of the invertase inhibitor is not increased. Further, the plant may contain more sucrose in its seeds than does an equivalent plant in which production or activity of the invertase inhibitor is not increased.

In other embodiments, the method comprises decreasing production or activity of the one or more invertase or invertase inhibitors. This may be accomplished by introducing a nucleic acid molecule into the coffee that inhibits the expression of one or more of the invertase- or invertase inhibitor-encoding genes. In a particular embodiment, the expression or activity of an invertase is decreased. In this embodiment, the plant may contain more sucrose in its seeds than does an equivalent plant in which production or activity of the invertase is not decreased.

Other features and advantages of the invention will be understood by reference to the drawings, detailed description and examples that follow.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1. Model for sucrose metabolism in tomato fruit. S (sucrose) is imported from phloem by a symplastic pathway or is hydrolysed by cell-wall invertase. Glucose and fructose are imported into the cytosol by specific sugar transporter proteins. In the cytosol, sucrose is degraded by SS (sucrose synthase) and its re-synthesis is catalysed by SPS (sucrose phosphate synthase) associated with SP (sucrose phosphatase) or SS. Sucrose can be exported into the vacuole and hydrolysed by vacuolar invertase. UDP-glucose after modifications can be used for starch synthesis in chromoplast. Abbreviations: G, glucose; F, fructose; F6-P, fructose 6-phosphate; UDP-G, UDP-glucose; G6-P, glucose 6-phosphate; S6-P, sucrose 6-phosphate; I, invertase; SP, sucrose phosphatase; SPS, sucrose phosphate synthase.

FIG. 2. Protein sequence alignment of CcInv2 with vacuolar acid invertase proteins. Protein sequences were selected based on BLASTp homology search using CcInv2 (Coffea canephora Invertase 2, SEQ ID NO:10). GenBank accession numbers are P29000 for acid invertase from tomato TIV1 (Lycopersicon esculentum) (SEQ ID NO:17), CAA47636.1 for acid invertase from carrot (Daucus carota) (SEQ ID NO:18), AAQ17074 for acid invertase from potato (Solanum tuberosum) (SEQ ID NO:19) and CAE01318 for inv2 from Coffea arabica (SEQ ID NO:20). Amino acids that differ from those of the CcInv2 sequence are colored in gray. The alignment was done using the Clustal W program in the MegAlign Software (Lasergene package, DNA STAR). The amino acid sequence NDPNG is a hallmark of plant acid invertases (βF-motif). The sequence WECVDF is specific for vacuolar invertase.

FIG. 3. Protein sequence alignment of CaInv3 with neutral invertase proteins. Protein sequences were selected based on BLASTp homology search using CaInv3 (Coffea arabica Invertase 3, SEQ ID NO:11). GenBank accession numbers are NP_567347 for AT NInv (neutral cytoplasmic invertase from A. thaliana) (SEQ ID NO:21), and CAG30577 for LJNInv1 (neutral cytoplasmic invertase from Lotus corniculatus var. japonicus) (SEQ ID NO:22). Alignment was done using the Clustal W program in the MegAlign Software (Lasergene package, DNA STAR). Amino acids that differ from those of the CaInv3 sequence are colored in gray.

FIG. 4. Partial protein sequence alignment of CcInv4 with TIV1 and LIN6 acid invertase proteins. Partial protein alignment between CcInv4 (SEQ ID NO:12), TIV1 (vacuolar invertase) (SEQ ID NO:17) and LIN6 (cell wall bound invertase) (SEQ ID NO:23) was done using the Clustal W program in the MegAlign Software (Lasergene package, DNA STAR). GenBank accession numbers are P29000 for TIV1 and AAM28823 for LIN6 from tomato (Lycopersicon esculentum). Amino acids that differ from those of the CcInv4 sequence are colored in gray.

FIG. 5. Protein sequence alignment of CcInv1 with cell-wall bound invertase proteins. Protein sequences were selected based on BLASTp homology search using CcInv1 (Coffea canephora Invertase 1, SEQ ID NO:9). GenBank accession numbers are CAB85897 for LIN5 (SEQ ID NO:24), AAM28823 for LIN6 from tomato (Lycopersicon esculentum) (SEQ ID NO:23), CAA49162.1 for DCCWInv invertase from carrot (Daucus carota) (SEQ ID NO:25), and CAE01317 for inv1 from Coffea arabica (SEQ ID NO:26). Amino acids that differ from those of the CcInv1 sequence are colored in gray. The alignment was done using the Clustal W program in the MegAlign Software (Lasergene package, DNA STAR). The amino acid sequence NDPNG (SEQ ID NO:27) is a hallmark of plant acid invertases (βF-motif). The sequence WECPDF (SEQ ID NO:28) is specific for periplasmic or cell wall-bound invertase.

FIG. 6. Protein sequence alignment of CcInvI with invertase inhibitor proteins. Alignment of CcInvI 1, 2, 3 and 4 proteins (SEQ ID NOS: 13, 14, 15 and 16, respectively) with ZM-InvI (CAC69335.1) from corn (Zea mays) (SEQ ID NO:29) and Nt InvI (AAT01640) from tobacco (Nicotiana tabacum) (SEQ ID NO:30). Amino acids identical to the consensus sequence are colored in gray. Four conserved Cys residues are noted. The alignment was done using the Clustal W program in the MegAlign Software (Lasergene package, DNA STAR).

FIG. 7. Changes of acid and neutral invertase activity in whole grains (separated from pericarp and locules) during grain maturation in CCCA12 (C. arabica) and FRT05 and FRT64 (C. canephora). Coffee cherries at four different maturation stages, characterized by size and color, were used for this study, i.e., SG (small green), LG (large green), Y (yellow), and R (red). Enzymatic activities are expressed in μmol·h⁻¹·mg⁻¹ protein.

FIG. 8. Tissue-specific expression profile of CcInv1 (cell wall-bound), CcInv2 (vacuolar) (A and B) and CcInv3 (cytoplasmic) invertases in C. canephora (robusta, BP409) and C. arabica (arabica, T2308) using real-time RT-PCR. Total RNA was isolated from root, flower, leaf and coffee beans harvested at four different maturation stages, i.e., Small-Green (SG), Large-Green (LG), Yellow (Y) and Red (R). For each maturation stage, coffee cherries were separated into pericarp (P) and grains (G). Total RNA was reverse transcribed and subjected to real-time PCR using TaqMan-MGB probes. Relative amounts were calculated and normalized with respect to rpl39 transcript levels. Data shown represent mean values obtained from three amplification reactions and the error bars indicate the SD of the mean.

FIG. 9. Tissue-specific expression profile of CcInvI1, CcInvI2, CcInvI3 and CcInvI4 invertase inhibitors in C. canephora (robusta, BP409) and C. arabica (arabica, T2308) using real-time RT-PCR. Total RNA was isolated from root, flower, leaf and coffee beans harvested at four different maturation stages, i.e., Small-Green (SG), Large-Green (LG), Yellow (Y), and Red (R). For each maturation stage, coffee cherries were separated into pericarp (P) and grains (G). Total RNA was reverse transcribed and subjected to real-time PCR using TaqMan-MGB probes. Relative amounts were calculated and normalized with respect to rpl39 transcript levels. The data represent mean values obtained from three amplification reactions and the error bars indicate the SD of the mean.
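The descriptions of FIGS. 8 and 9 state that relative amounts were normalized with respect to rpl39 transcript levels but do not give the calculation. One common way such a normalization may be carried out is the comparative Ct approach, sketched below in Python. This is offered only as an illustration under assumptions not found in the text: an amplification efficiency of about 100% (a doubling per cycle), pairing of the three replicate reactions, and Ct values that are invented for the example.

```python
from statistics import mean, stdev


def relative_expression(target_cts, reference_cts):
    """Relative amount of a target transcript versus a reference transcript.

    Computed per replicate as 2 ** -(Ct_target - Ct_reference), which assumes
    ideal doubling at each PCR cycle (an assumption, not stated in the text).
    """
    if len(target_cts) != len(reference_cts):
        raise ValueError("target and reference need the same number of replicates")
    return [2.0 ** -(t - r) for t, r in zip(target_cts, reference_cts)]


# Invented Ct values for three amplification reactions (illustrative only).
target_cts = [24.0, 24.0, 25.0]   # hypothetical invertase transcript
rpl39_cts = [20.0, 20.0, 20.0]    # hypothetical rpl39 reference transcript

values = relative_expression(target_cts, rpl39_cts)
print("per-replicate relative amounts:", values)
print("mean:", mean(values), "SD:", stdev(values))
```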

DETAILED DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS

Definitions

Various terms relating to the biological molecules and other aspects of the present invention are used throughout the specification and claims. The terms are presumed to have their customary meaning in the field of molecular biology and biochemistry unless they are specifically defined otherwise herein.

The term “sucrose metabolizing enzyme” refers to enzymes in plants that primarily function to accumulate sucrose or degrade sucrose within the plant and include, for example, sucrose synthase (SuSy), sucrose phosphate synthase (SPS) and sucrose phosphatase (SP), as well as invertases (Inv) of various types, and invertase inhibitors (Inv I). Together, the different sucrose metabolizing enzymes operate to control the metabolism of sucrose as needed by the plant for either storage or for energy needs.

“Isolated” means altered “by the hand of man” from the natural state. If a composition or substance occurs in nature, it has been “isolated” if it has been changed or removed from its original environment, or both. For example, a polynucleotide or a polypeptide naturally present in a living plant or animal is not “isolated,” but the same polynucleotide or polypeptide separated from the coexisting materials of its natural state is “isolated”, as the term is employed herein.

“Polynucleotide”, also referred to as “nucleic acid molecule”, generally refers to any polyribonucleotide or polydeoxyribonucleotide, which may be unmodified RNA or DNA or modified RNA or DNA. “Polynucleotides” include, without limitation, single- and double-stranded DNA, DNA that is a mixture of single- and double-stranded regions, single- and double-stranded RNA, RNA that is a mixture of single- and double-stranded regions, and hybrid molecules comprising DNA and RNA that may be single-stranded or, more typically, double-stranded or a mixture of single- and double-stranded regions. In addition, “polynucleotide” refers to triple-stranded regions comprising RNA or DNA or both RNA and DNA. The term polynucleotide also includes DNAs or RNAs containing one or more modified bases and DNAs or RNAs with backbones modified for stability or for other reasons. “Modified” bases include, for example, tritylated bases and unusual bases such as inosine. A variety of modifications can be made to DNA and RNA; thus, “polynucleotide” embraces chemically, enzymatically or metabolically modified forms of polynucleotides as typically found in nature, as well as the chemical forms of DNA and RNA characteristic of viruses and cells. “Polynucleotide” also embraces relatively short polynucleotides, often referred to as oligonucleotides.

“Polypeptide” refers to any peptide or protein comprising two or more amino acids joined to each other by peptide bonds or modified peptide bonds, i.e., peptide isosteres. “Polypeptide” refers to both short chains, commonly referred to as peptides, oligopeptides or oligomers, and to longer chains, generally referred to as proteins. Polypeptides may contain amino acids other than the 20 gene-encoded amino acids. “Polypeptides” include amino acid sequences modified either by natural processes, such as post-translational processing, or by chemical modification techniques which are well known in the art. Such modifications are well described in basic texts and in more detailed monographs, as well as in a voluminous research literature. Modifications can occur anywhere in a polypeptide, including the peptide backbone, the amino acid side-chains and the amino or carboxyl termini. It will be appreciated that the same type of modification may be present in the same or varying degrees at several sites in a given polypeptide. Also, a given polypeptide may contain many types of modifications. Polypeptides may be branched as a result of ubiquitination, and they may be cyclic, with or without branching. Cyclic, branched and branched cyclic polypeptides may result from natural posttranslational processes or may be made by synthetic methods. 
Modifications include acetylation, acylation, ADP-ribosylation, amidation, covalent attachment of flavin, covalent attachment of a heme moiety, covalent attachment of a nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid derivative, covalent attachment of phosphatidylinositol, cross-linking, cyclization, disulfide bond formation, demethylation, formation of covalent cross-links, formation of cystine, formation of pyroglutamate, formylation, gamma-carboxylation, glycosylation, GPI anchor formation, hydroxylation, iodination, methylation, myristoylation, oxidation, proteolytic processing, phosphorylation, prenylation, racemization, selenoylation, sulfation, transfer-RNA mediated addition of amino acids to proteins such as arginylation, and ubiquitination. See, for instance, Proteins—Structure and Molecular Properties, 2nd Ed., T. E. Creighton, W. H. Freeman and Company, New York, 1993 and Wold, F., Posttranslational Protein Modifications: Perspectives and Prospects, pgs. 1-12 in Posttranslational Covalent Modification of Proteins, B. C. Johnson, Ed., Academic Press, New York, 1983; Seifter et al., “Analysis for Protein Modifications and Nonprotein Cofactors”, Meth Enzymol (1990) 182:626-646 and Rattan et al., “Protein Synthesis: Posttranslational Modifications and Aging”, Ann NY Acad Sci (1992) 663:48-62.

“Variant” as the term is used herein, is a polynucleotide or polypeptide that differs from a reference polynucleotide or polypeptide respectively, but retains essential properties. A typical variant of a polynucleotide differs in nucleotide sequence from another, reference polynucleotide. Changes in the nucleotide sequence of the variant may or may not alter the amino acid sequence of a polypeptide encoded by the reference polynucleotide. Nucleotide changes may result in amino acid substitutions, additions, deletions, fusions and truncations in the polypeptide encoded by the reference sequence, as discussed below. A typical variant of a polypeptide differs in amino acid sequence from another, reference polypeptide. Generally, differences are limited so that the sequences of the reference polypeptide and the variant are closely similar overall and, in many regions, identical. A variant and reference polypeptide may differ in amino acid sequence by one or more substitutions, additions or deletions in any combination. A substituted or inserted amino acid residue may or may not be one encoded by the genetic code. A variant of a polynucleotide or polypeptide may be naturally occurring, such as an allelic variant, or it may be a variant that is not known to occur naturally. Non-naturally occurring variants of polynucleotides and polypeptides may be made by mutagenesis techniques or by direct synthesis.

In reference to mutant plants, the terms “null mutant” or “loss-of-function mutant” are used to designate an organism or genomic DNA sequence with a mutation that causes a gene product to be non-functional or largely absent. Such mutations may occur in the coding and/or regulatory regions of the gene, and may be changes of individual residues, or insertions or deletions of regions of nucleic acids. These mutations may also occur in the coding and/or regulatory regions of other genes which may regulate or control a gene and/or encoded protein, so as to cause the protein to be non-functional or largely absent.

The term “substantially the same” refers to nucleic acid or amino acid sequences having sequence variations that do not materially affect the nature of the protein (i.e. the structure, stability characteristics, substrate specificity and/or biological activity of the protein). With particular reference to nucleic acid sequences, the term “substantially the same” is intended to refer to the coding region and to conserved sequences governing expression, and refers primarily to degenerate codons encoding the same amino acid, or alternate codons encoding conservative substitute amino acids in the encoded polypeptide. With reference to amino acid sequences, the term “substantially the same” refers generally to conservative substitutions and/or variations in regions of the polypeptide not involved in determination of structure or function.
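The reference above to degenerate codons encoding the same amino acid can be illustrated with a short Python sketch in which two coding sequences that differ at several nucleotide positions translate to the same peptide. The codon assignments shown are from the standard genetic code, but the table is deliberately partial and covers only the codons used in the example; the two coding sequences and the function names are hypothetical and are not taken from the sequence listing.

```python
# Partial codon table (standard genetic code), limited to codons used below.
CODON_TABLE = {
    "TGG": "W",
    "GAA": "E", "GAG": "E",
    "TGT": "C", "TGC": "C",
    "CCT": "P", "CCC": "P", "CCA": "P", "CCG": "P",
    "GAT": "D", "GAC": "D",
    "TTT": "F", "TTC": "F",
}


def translate(cds):
    """Translate a coding sequence codon by codon using the partial table above.

    Raises KeyError for codons outside the partial table.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def count_differences(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(1 for x, y in zip(a, b) if x != y)


# Two hypothetical coding sequences using alternate (degenerate) codons.
seq_a = "TGGGAATGTCCTGATTTT"
seq_b = "TGGGAGTGCCCAGACTTC"

print(translate(seq_a), translate(seq_b))
print("nucleotide differences:", count_differences(seq_a, seq_b))
```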

The terms “percent identical” and “percent similar” are also used herein in comparisons among amino acid and nucleic acid sequences. When referring to amino acid sequences, “identity” or “percent identical” refers to the percent of the amino acids of the subject amino acid sequence that have been matched to identical amino acids in the compared amino acid sequence by a sequence analysis program. “Percent similar” refers to the percent of the amino acids of the subject amino acid sequence that have been matched to identical or conserved amino acids. Conserved amino acids are those which differ in structure but are similar in physical properties such that the exchange of one for another would not appreciably change the tertiary structure of the resulting protein. Conservative substitutions are defined in Taylor (1986, J. Theor. Biol. 119:205). When referring to nucleic acid molecules, “percent identical” refers to the percent of the nucleotides of the subject nucleic acid sequence that have been matched to identical nucleotides by a sequence analysis program.
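The definitions of “percent identical” and “percent similar” given above can be sketched in Python for a pair of sequences that have already been aligned. This is a minimal illustration under stated assumptions: the alignment is supplied as two equal-length strings with “-” marking gaps, the denominator is the number of residues in the subject sequence, and the grouping of conserved amino acids is a simplified, hypothetical grouping chosen for the example rather than the classification of Taylor (1986). The function name and toy sequences are likewise hypothetical.

```python
# Simplified, hypothetical groups of physically similar amino acids.
# This is NOT the Taylor (1986) classification cited in the text.
SIMILAR_GROUPS = ["ILVM", "FWY", "KRH", "DE", "ST", "NQ"]


def _is_conserved(a, b):
    """True if residues are identical or fall within the same illustrative group."""
    return a == b or any(a in g and b in g for g in SIMILAR_GROUPS)


def percent_identity_similarity(subject_aln, compared_aln):
    """Return (percent identical, percent similar) relative to the subject length.

    Both arguments are rows of one pairwise alignment, with '-' for gaps.
    """
    if len(subject_aln) != len(compared_aln):
        raise ValueError("aligned sequences must have equal length")
    subject_aln, compared_aln = subject_aln.upper(), compared_aln.upper()
    subject_len = sum(1 for s in subject_aln if s != "-")
    if subject_len == 0:
        raise ValueError("subject sequence has no residues")
    identical = similar = 0
    for s, c in zip(subject_aln, compared_aln):
        if s == "-" or c == "-":
            continue  # a residue opposite a gap is matched to nothing
        if s == c:
            identical += 1
        if _is_conserved(s, c):
            similar += 1
    return (100.0 * identical / subject_len, 100.0 * similar / subject_len)


# Toy alignment (hypothetical sequences, for illustration only).
print(percent_identity_similarity("WECPDFKLST", "WECVDFRL-T"))
```

In this toy alignment, seven of the ten subject residues are matched to identical residues, and one further pair (K and R) counts as conserved under the illustrative grouping.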

“Identity” and “similarity” can be readily calculated by known methods. Nucleic acid sequences and amino acid sequences can be compared using computer programs that align the similar sequences of the nucleic or amino acids and thus define the differences. In preferred methodologies, the BLAST programs (NCBI) and parameters used therein are employed, and the DNAstar system (Madison, Wis.) is used to align sequence fragments of genomic DNA sequences. However, equivalent alignments and similarity/identity assessments can be obtained through the use of any standard alignment software. For instance, the GCG Wisconsin Package version 9.1, available from the Genetics Computer Group in Madison, Wis., and the default parameters used (gap creation penalty=12, gap extension penalty=4) by that program may also be used to compare sequence identity and similarity.

“Antibodies” as used herein includes polyclonal and monoclonal antibodies, chimeric, single chain, and humanized antibodies, as well as antibody fragments (e.g., Fab, Fab′, F(ab′)₂ and Fv), including the products of a Fab or other immunoglobulin expression library. With respect to antibodies, the term, “immunologically specific” or “specific” refers to antibodies that bind to one or more epitopes of a protein of interest, but which do not substantially recognize and bind other molecules in a sample containing a mixed population of antigenic biological molecules. Screening assays to determine binding specificity of an antibody are well known and routinely practiced in the art. For a comprehensive discussion of such assays, see Harlow et al. (Eds.), ANTIBODIES A LABORATORY MANUAL; Cold Spring Harbor Laboratory; Cold Spring Harbor, N.Y. (1988), Chapter 6.

The term “substantially pure” refers to a preparation comprising at least 50-60% by weight the compound of interest (e.g., nucleic acid, oligonucleotide, protein, etc.). More preferably, the preparation comprises at least 75% by weight, and most preferably 90-99% by weight, the compound of interest. Purity is measured by methods appropriate for the compound of interest (e.g. chromatographic methods, agarose or polyacrylamide gel electrophoresis, HPLC analysis, and the like).

With respect to single-stranded nucleic acid molecules, the term “specifically hybridizing” refers to the association between two single-stranded nucleic acid molecules of sufficiently complementary sequence to permit such hybridization under pre-determined conditions generally used in the art (sometimes termed “substantially complementary”). In particular, the term refers to hybridization of an oligonucleotide with a substantially complementary sequence contained within a single-stranded DNA or RNA molecule, to the substantial exclusion of hybridization of the oligonucleotide with single-stranded nucleic acids of non-complementary sequence.

A “coding sequence” or “coding region” refers to a nucleic acid molecule having sequence information necessary to produce a gene product, when the sequence is expressed. The coding sequence may comprise untranslated sequences (e.g., introns or 5′ or 3′ untranslated regions) within translated regions, or may lack such intervening untranslated sequences (e.g., as in cDNA).

“Intron” refers to polynucleotide sequences in a nucleic acid that do not code information related to protein synthesis. Such sequences are transcribed into mRNA, but are removed before translation of the mRNA into a protein.

The term “operably linked” or “operably inserted” means that the regulatory sequences necessary for expression of the coding sequence are placed in a nucleic acid molecule in the appropriate positions relative to the coding sequence so as to enable expression of the coding sequence. By way of example, a promoter is operably linked with a coding sequence when the promoter is capable of controlling the transcription or expression of that coding sequence. Coding sequences can be operably linked to promoters or regulatory sequences in a sense or antisense orientation. The term “operably linked” is sometimes applied to the arrangement of other transcription control elements (e.g. enhancers) in an expression vector.

Transcriptional and translational control sequences are DNA regulatory sequences, such as promoters, enhancers, polyadenylation signals, terminators, and the like, that provide for the expression of a coding sequence in a host cell.

The terms “promoter”, “promoter region” or “promoter sequence” refer generally to transcriptional regulatory regions of a gene, which may be found at the 5′ or 3′ side of the coding region, or within the coding region, or within introns. Typically, a promoter is a DNA regulatory region capable of binding RNA polymerase in a cell and initiating transcription of a downstream (3′ direction) coding sequence. The typical 5′ promoter sequence is bounded at its 3′ terminus by the transcription initiation site and extends upstream (5′ direction) to include the minimum number of bases or elements necessary to initiate transcription at levels detectable above background. Within the promoter sequence is a transcription initiation site (conveniently defined by mapping with nuclease S1), as well as protein binding domains (consensus sequences) responsible for the binding of RNA polymerase.

A “vector” is a replicon, such as plasmid, phage, cosmid, or virus to which another nucleic acid segment may be operably inserted so as to bring about the replication or expression of the segment.

The term “nucleic acid construct” or “DNA construct” is sometimes used to refer to a coding sequence or sequences operably linked to appropriate regulatory sequences and inserted into a vector for transforming a cell. This term may be used interchangeably with the term “transforming DNA” or “transgene”. Such a nucleic acid construct may contain a coding sequence for a gene product of interest, along with a selectable marker gene and/or a reporter gene.

A “marker gene” or “selectable marker gene” is a gene whose encoded gene product confers a feature that enables a cell containing the gene to be selected from among cells not containing the gene. Vectors used for genetic engineering typically contain one or more selectable marker genes. Types of selectable marker genes include (1) antibiotic resistance genes, (2) herbicide tolerance or resistance genes, and (3) metabolic or auxotrophic marker genes that enable transformed cells to synthesize an essential component, usually an amino acid, which the cells cannot otherwise produce.

A “reporter gene” is also a type of marker gene. It typically encodes a gene product that is assayable or detectable by standard laboratory means (e.g., enzymatic activity, fluorescence).

The term “express,” “expressed,” or “expression” of a gene refers to the biosynthesis of a gene product. The process involves transcription of the gene into mRNA and then translation of the mRNA into one or more polypeptides, and encompasses all naturally occurring post-translational modifications.

“Endogenous” refers to any constituent, for example, a gene or nucleic acid, or polypeptide, that can be found naturally within the specified organism.

A “heterologous” region of a nucleic acid construct is an identifiable segment (or segments) of the nucleic acid molecule within a larger molecule that is not found in association with the larger molecule in nature. Thus, when the heterologous region comprises a gene, the gene will usually be flanked by DNA that does not flank the genomic DNA in the genome of the source organism. In another example, a heterologous region is a construct where the coding sequence itself is not found in nature (e.g., a cDNA where the genomic coding sequence contains introns, or synthetic sequences having codons different than the native gene). Allelic variations or naturally-occurring mutational events do not give rise to a heterologous region of DNA as defined herein. The term “DNA construct”, as defined above, is also used to refer to a heterologous region, particularly one constructed for use in transformation of a cell.

A cell has been “transformed” or “transfected” by exogenous or heterologous DNA when such DNA has been introduced inside the cell. The transforming DNA may or may not be integrated (covalently linked) into the genome of the cell. In prokaryotes, yeast, and mammalian cells for example, the transforming DNA may be maintained on an episomal element such as a plasmid. With respect to eukaryotic cells, a stably transformed cell is one in which the transforming DNA has become integrated into a chromosome so that it is inherited by daughter cells through chromosome replication. This stability is demonstrated by the ability of the eukaryotic cell to establish cell lines or clones comprised of a population of daughter cells containing the transforming DNA. A “clone” is a population of cells derived from a single cell or common ancestor by mitosis. A “cell line” is a clone of a primary cell that is capable of stable growth in vitro for many generations.

“Grain,” “seed,” or “bean,” refers to a flowering plant's unit of reproduction, capable of developing into another such plant. As used herein, especially with respect to coffee plants, the terms are used synonymously and interchangeably.

As used herein, the term “plant” includes reference to whole plants, plant organs (e.g., leaves, stems, shoots, roots), seeds, pollen, plant cells, plant cell organelles, and progeny thereof. Parts of transgenic plants are to be understood within the scope of the invention to comprise, for example, plant cells, protoplasts, tissues, callus, embryos as well as flowers, stems, seeds, pollen, fruits, leaves, or roots originating in transgenic plants or their progeny.

DESCRIPTION

Sucrose is a major contributor of free reducing sugars involved in the Maillard reaction that occurs during the roasting of coffee grain. Therefore, it is widely believed to be an important flavor precursor molecule in the green coffee grain. Consistent with this idea, the highest quality Arabica grains have appreciably higher levels of sucrose (between 7.3 and 11.4%) than the lowest quality Robusta grains (between 4 and 5%). Also, sucrose, while being significantly degraded during roasting, can remain in the roasted grain at concentrations of 0.4-2.8% dry weight (DW) and so participates directly in coffee's sweetness. Because of the clear correlation between the level of sucrose in the grain and coffee flavor, the ability to understand and manipulate the underlying genetic basis for variations in sucrose metabolism and carbon partitioning in coffee grain is important.

Key enzymes involved in sucrose metabolism have been characterized in model organisms (e.g., tomato, potato, Arabidopsis). In accordance with the present invention, protein sequences of these enzymes have been used to perform similarity searches in Coffea canephora and C. arabica cDNA libraries and EST databases using the tBLASTn algorithm, as described in greater detail in the examples. Full-length cDNAs encoding CcInv1 (cell wall bound invertase), CcInv2 (vacuolar invertase) and CaInv3 (cytoplasmic invertase) were isolated. A partial cDNA sequence (CcInv4) was also isolated, and is believed to represent a cell wall bound invertase. In addition, four full-length cDNA sequences encoding likely invertase inhibitors CcInvI1, CcInvI2, CcInvI3 and CcInvI4 have been identified and characterized.

One aspect of the present invention relates to nucleic acid molecules from coffee that encode a variety of invertases: cell wall invertases CcInv1 (SEQ ID NO:1) and CcInv4 (SEQ ID NO:4, partial sequence), vacuolar invertase CcInv2 (SEQ ID NO:2), and neutral invertase CaInv3 (SEQ ID NO:3); and four full-length invertase inhibitors: CcInvI1 (SEQ ID NO:5), CcInvI2 (SEQ ID NO:6), CcInvI3 (SEQ ID NO:7), and CcInvI4 (SEQ ID NO:8).

Another aspect of the invention relates to the proteins produced by expression of these nucleic acid molecules and their uses. The deduced amino acid sequences of the proteins produced by expression of SEQ ID NOS: 1, 2, 3, 4, 5, 6, 7 and 8 are set forth herein as SEQ ID NOS: 9, 10, 11, 12, 13, 14, 15, and 16, respectively. Still other aspects of the invention relate to uses of the nucleic acid molecules and encoded polypeptides in plant breeding and in genetic manipulation of plants, and ultimately in the manipulation of coffee flavor, aroma and other qualities.
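By way of illustration only, the correspondence recited above between each clone, its cDNA sequence and its deduced amino acid sequence may be collected in a small lookup table. The SEQ ID NO assignments are those stated in this specification; the dictionary layout and the helper name `protein_seq_id` are assumptions made for this sketch.

```python
# SEQ ID NO assignments as recited in the text above.
# Tuple layout (role, cDNA SEQ ID NO, protein SEQ ID NO) is illustrative only.
SEQ_IDS = {
    "CcInv1": ("cell wall invertase", 1, 9),
    "CcInv2": ("vacuolar invertase", 2, 10),
    "CaInv3": ("neutral (cytoplasmic) invertase", 3, 11),
    "CcInv4": ("cell wall invertase (partial sequence)", 4, 12),
    "CcInvI1": ("invertase inhibitor", 5, 13),
    "CcInvI2": ("invertase inhibitor", 6, 14),
    "CcInvI3": ("invertase inhibitor", 7, 15),
    "CcInvI4": ("invertase inhibitor", 8, 16),
}


def protein_seq_id(clone):
    """Return the SEQ ID NO of the deduced protein for a named clone."""
    return SEQ_IDS[clone][2]


for clone, (role, nt_id, aa_id) in SEQ_IDS.items():
    print(f"{clone}: {role}; cDNA SEQ ID NO:{nt_id}; protein SEQ ID NO:{aa_id}")
```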

Although polynucleotides encoding invertase and invertase inhibitors from Coffea canephora are described and exemplified herein, this invention is intended to encompass nucleic acids and encoded proteins from other Coffea species that are sufficiently similar to be used interchangeably with the C. canephora polynucleotides and proteins for the purposes described below. Accordingly, when the terms “invertase” and “invertase inhibitor” are used herein, they are intended to encompass all Coffea invertases and invertase inhibitors having the general physical, biochemical and functional features described herein, and polynucleotides encoding them, unless specifically stated otherwise.

Considered in terms of their sequences, invertase or invertase inhibitor polynucleotides of the invention include allelic variants and natural mutants of SEQ ID NOS: 1, 2, 3, 4, 5, 6, 7, and 8, which are likely to be found in different varieties of C. canephora, and homologs of SEQ ID NOS: 9, 10, 11, 12, 13, 14, 15, and 16 are likely to be found in different coffee species. Because such variants and homologs are expected to possess certain differences in nucleotide and amino acid sequence, this invention provides (1) isolated invertase-encoding nucleic acid molecules that encode respective polypeptides having at least about 70% (and, with increasing order of preference, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% and 99%) identity with the encoded polypeptide of any one of SEQ ID NOS: 9, 10, 11 or 12, and (2) isolated invertase inhibitor-encoding nucleic acid molecules that encode respective polypeptides having at least about 25% (and, with increasing order of preference, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% and 99%) identity with the encoded polypeptide of any one of SEQ ID NOS: 13, 14, 15, or 16 and comprise a nucleotide sequence having equivalent ranges of identity to any one of SEQ ID NOS: 1, 2, 3, 4, 5, 6, 7 or 8, respectively. Because of the natural sequence variation likely to exist among invertases and invertase inhibitors, and the genes encoding them in different coffee varieties and species, one skilled in the art would expect to find this level of variation, while still maintaining the unique properties of the polypeptides and polynucleotides of the present invention. Such an expectation is due in part to the degeneracy of the genetic code, as well as to the known evolutionary success of conservative amino acid sequence variations, which do not appreciably alter the nature of the encoded protein. Accordingly, such variants and homologs are considered substantially the same as one another and are included within the scope of the present invention.
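As a non-limiting sketch of how the identity thresholds recited above might be checked, percent identity can be computed over a pairwise alignment that has already been produced by an alignment program. The convention used here (identical residues divided by the number of aligned columns in which neither sequence has a gap) is only one of several in common use and is an assumption of this sketch. The function names and the toy sequences are hypothetical and do not correspond to any SEQ ID NO.

```python
# Minimum identity (percent) recited above for each class of polypeptide.
THRESHOLDS = {"invertase": 70.0, "invertase inhibitor": 25.0}


def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are excluded under this convention
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def meets_threshold(aligned_query, aligned_reference, kind):
    """True if the query reaches the minimum identity recited for `kind`."""
    return percent_identity(aligned_query, aligned_reference) >= THRESHOLDS[kind]


# Hypothetical pre-aligned toy peptides (not taken from the sequence listing).
query = "MKTAYIAKQR"
reference = "MKTAHIAK-R"
print(f"{percent_identity(query, reference):.1f}% identity")
print(meets_threshold(query, reference, "invertase"))
```

In practice the alignment itself, and therefore the resulting figure, depends on the alignment program and its parameters.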

The following sections set forth the general procedures involved in practicing the present invention. To the extent that specific materials are mentioned, it is merely for the purpose of illustration, and is not intended to limit the invention. Unless otherwise specified, general biochemical and molecular biological procedures, such as those set forth in Sambrook et al., Molecular Cloning, Cold Spring Harbor Laboratory (1989) or Ausubel et al. (eds), Current Protocols in Molecular Biology, John Wiley & Sons (2005) are used.

Nucleic Acid Molecules, Proteins and Antibodies:

Nucleic acid molecules of the invention may be prepared by two general methods: (1) they may be synthesized from appropriate nucleotide triphosphates, or (2) they may be isolated from biological sources. Both methods utilize protocols well known in the art.

The availability of nucleotide sequence information, such as the cDNA having SEQ ID NOS: 1, 2, 3, 4, 5, 6, 7 or 8 enables preparation of an isolated nucleic acid molecule of the invention by oligonucleotide synthesis. Synthetic oligonucleotides may be prepared by the phosphoramidite method employed in the Applied Biosystems 38A DNA Synthesizer or similar devices. The resultant construct may be purified according to methods known in the art, such as high performance liquid chromatography (HPLC). Long, double-stranded polynucleotides, such as a DNA molecule of the present invention, must be synthesized in stages, due to the size limitations inherent in current oligonucleotide synthetic methods. Thus, for example, a long double-stranded molecule may be synthesized as several smaller segments of appropriate complementarity. Complementary segments thus produced may be annealed such that each segment possesses appropriate cohesive termini for attachment of an adjacent segment. Adjacent segments may be ligated by annealing cohesive termini in the presence of DNA ligase to construct an entire long double-stranded molecule. A synthetic DNA molecule so constructed may then be cloned and amplified in an appropriate vector.
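The staged assembly described above can be sketched as follows. This is a simplified illustration, not a synthesis protocol: it assumes the bottom strand of each segment is shifted by a fixed number of bases relative to the top strand, so that every internal junction carries a cohesive terminus while the two outer ends are blunt. The function names, the segment length, the overhang length and the example sequence are all hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def design_segments(sequence, segment_len, overhang):
    """Split a duplex into annealed segments with cohesive termini.

    Returns a list of (top_oligo, bottom_oligo) pairs, each written 5'->3'.
    The bottom oligo of each segment is offset by `overhang` bases so that
    it bridges the junction with the next segment.
    """
    if not 0 < overhang < segment_len:
        raise ValueError("overhang must be between 1 and segment_len - 1")
    n = len(sequence)
    segments = []
    for start in range(0, n, segment_len):
        end = min(start + segment_len, n)
        # Shift the bottom strand except at the blunt outer ends of the molecule.
        b_start = start + overhang if start > 0 else 0
        b_end = min(end + overhang, n) if end < n else n
        top = sequence[start:end]
        bottom = reverse_complement(sequence[b_start:b_end])
        segments.append((top, bottom))
    return segments


# Hypothetical 30-base example sequence.
target = "ATGGCTAGCTAGGATCCGATCGTACGTTAG"
for top, bottom in design_segments(target, segment_len=10, overhang=4):
    print(top, bottom)
```

Joining the top oligos in order, or the reverse complements of the bottom oligos in order, reproduces the full-length sequence, which is the condition that ligation of adjacent segments relies on.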

In accordance with the present invention, nucleic acids having the appropriate level of sequence homology with part or all of the coding and/or regulatory regions of invertase or invertase inhibitor polynucleotides may be identified by using hybridization and washing conditions of appropriate stringency. It will be appreciated by those skilled in the art that the aforementioned strategy, when applied to genomic sequences, will, in addition to enabling isolation of sucrose metabolizing enzyme-coding sequences, also enable isolation of promoters and other gene regulatory sequences associated with sucrose metabolizing enzyme genes, even though the regulatory sequences themselves may not share sufficient homology to enable suitable hybridization.

As a typical illustration, hybridizations may be performed, according to the method of Sambrook et al., using a hybridization solution comprising: 5×SSC, 5×Denhardt's reagent, 1.0% SDS, 100 μg/ml denatured, fragmented salmon sperm DNA, 0.05% sodium pyrophosphate and up to 50% formamide. Hybridization is carried out at 37-42° C. for at least six hours. Following hybridization, filters are washed as follows: (1) 5 minutes at room temperature in 2×SSC and 1% SDS; (2) 15 minutes at room temperature in 2×SSC and 0.1% SDS; (3) 30 minutes-1 hour at 37° C. in 2×SSC and 0.1% SDS; (4) 2 hours at 45-55° C. in 2×SSC and 0.1% SDS, changing the solution every 30 minutes.

One common formula for calculating the stringency conditions required to achieve hybridization between nucleic acid molecules of a specified sequence homology is as follows (Sambrook et al., 1989):

Tm=81.5° C.+16.6 Log [Na+]+0.41(% G+C)−0.63(% formamide)−600/#bp in duplex

As an illustration of the above formula, using [Na+]=[0.368] and 50% formamide, with GC content of 42% and an average probe size of 200 bases, the Tm is 57° C. The Tm of a DNA duplex decreases by 1-1.5° C. with every 1% decrease in homology. Thus, targets with greater than about 75% sequence identity would be observed using a hybridization temperature of 42° C. In one embodiment, the hybridization is at 37° C. and the final wash is at 42° C.; in another embodiment the hybridization is at 42° C. and the final wash is at 50° C.; and in yet another embodiment the hybridization is at 42° C. and final wash is at 65° C., with the above hybridization and wash solutions. Conditions of high stringency include hybridization at 42° C. in the above hybridization solution and a final wash at 65° C. in 0.1×SSC and 0.1% SDS for 10 minutes.
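The arithmetic of the above illustration can be reproduced with a short calculation. The sketch below simply evaluates the formula quoted from Sambrook et al. (1989) with the stated values; the function and parameter names are assumptions made for this example, and the sodium concentration is taken to be in mol/L.

```python
import math


def melting_temperature(na_molar, gc_percent, formamide_percent, duplex_bp):
    """Tm in degrees C from the formula quoted above.

    Tm = 81.5 + 16.6*log10[Na+] + 0.41(%G+C) - 0.63(%formamide) - 600/(bp in duplex)
    """
    return (
        81.5
        + 16.6 * math.log10(na_molar)
        + 0.41 * gc_percent
        - 0.63 * formamide_percent
        - 600.0 / duplex_bp
    )


# Values from the illustration in the text.
tm = melting_temperature(na_molar=0.368, gc_percent=42, formamide_percent=50, duplex_bp=200)
print(f"Tm = {tm:.1f} C")  # prints "Tm = 57.0 C"
```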

Nucleic acids of the present invention may be maintained as DNA in any convenient cloning vector. In a preferred embodiment, clones are maintained in a plasmid cloning/expression vector, such as pGEM-T (Promega Biotech, Madison, Wis.), pBluescript (Stratagene, La Jolla, Calif.), pCR4-TOPO (Invitrogen, Carlsbad, Calif.) or pET28a+ (Novagen, Madison, Wis.), all of which can be propagated in a suitable E. coli host cell.

Nucleic acid molecules of the invention include cDNA, genomic DNA, RNA, and fragments thereof which may be single-, double-, or even triple-stranded. Thus, this invention provides oligonucleotides (sense or antisense strands of DNA or RNA) having sequences capable of hybridizing with at least one sequence of a nucleic acid molecule of the present invention. Such oligonucleotides are useful as probes for detecting invertase or invertase inhibitor encoding genes or mRNA in test samples of plant tissue, e.g., by PCR amplification, or for the positive or negative regulation of expression of invertase or invertase inhibitor encoding genes at or before translation of the mRNA into proteins. Methods in which invertase or invertase inhibitor encoding oligonucleotides or polynucleotides may be utilized as probes for such assays include, but are not limited to: (1) in situ hybridization; (2) Southern hybridization; (3) northern hybridization; and (4) assorted amplification reactions such as polymerase chain reactions (PCR, including RT-PCR) and ligase chain reaction (LCR).

The oligonucleotides having sequences capable of hybridizing with at least one sequence of a nucleic acid molecule of the present invention include antisense oligonucleotides. Antisense oligonucleotides targeted to specific regions of the mRNA that are critical for translation may be utilized. The use of antisense molecules to decrease expression levels of a pre-determined gene is known in the art. Antisense molecules may be provided in situ by transforming plant cells with a DNA construct which, upon transcription, produces the antisense RNA sequences. Such constructs can be designed to produce full length or partial antisense sequences. This gene silencing effect can be enhanced by transgenically over-producing both sense and antisense RNA of the gene coding sequence so that a high amount of dsRNA is produced (for example see Waterhouse et al., 1998, PNAS 95: 13959-13964). In this regard, dsRNA containing sequences that correspond to part or all of at least one intron has been found particularly effective. In one embodiment, part or all of the sucrose invertase-encoding sequence antisense strand is expressed by a transgene. In another embodiment, hybridizing sense and antisense strands of part or all of the invertase-encoding sequence are transgenically expressed. In another embodiment, invertase genes may be silenced by use of small interfering RNA (siRNA; Elbashir et al., 2001, Genes Dev. 15(2):188-200) using commercially available materials and methods (e.g., Invitrogen, Inc., Carlsbad, Calif.). Preferably, the antisense oligonucleotides recognize and silence invertase mRNA or invertase expression.
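For illustration, the sequence of an antisense RNA complementary to a given stretch of sense strand can be derived by reverse complementation. The sketch below assumes the input is a sense-strand DNA or mRNA sequence written 5′ to 3′; the function name and the six-base example are hypothetical and are not taken from the sequence listing.

```python
# Watson-Crick partners, reported as RNA bases; T in a DNA input is treated like U.
_PAIRS = {"A": "U", "U": "A", "T": "A", "G": "C", "C": "G"}


def antisense_rna(sense):
    """Return the antisense RNA (5'->3') for a sense-strand DNA or mRNA sequence."""
    try:
        return "".join(_PAIRS[base] for base in reversed(sense.upper()))
    except KeyError as err:
        raise ValueError(f"unexpected base: {err}") from None


# Hypothetical example: the antisense RNA pairs base-for-base with the sense mRNA.
print(antisense_rna("AUGGCU"))  # prints "AGCCAU"
```

A partial antisense sequence, as contemplated above, corresponds to applying the same operation to a sub-region of the sense strand.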

Polypeptides encoded by nucleic acids of the invention may be prepared in a variety of ways, according to known methods. If produced in situ the polypeptides may be purified from appropriate sources, e.g., seeds, pericarps, or other plant parts.

Alternatively, the availability of nucleic acid molecules encoding the polypeptides enables production of the proteins using in vitro expression methods known in the art. For example, a cDNA or gene may be cloned into an appropriate in vitro transcription vector, such as pSP64 or pSP65 for in vitro transcription, followed by cell-free translation in a suitable cell-free translation system, such as wheat germ or rabbit reticulocytes. In vitro transcription and translation systems are commercially available, e.g., from Promega Biotech, Madison, Wis., BRL, Rockville, Md. or Invitrogen, Carlsbad, Calif.

According to a preferred embodiment, larger quantities of polypeptides may be produced by expression in a suitable procaryotic or eucaryotic system. For example, part or all of a DNA molecule, such as the cDNAs having SEQ ID NOS: 1, 2, 3, 4, 5, 6, 7 or 8 may be inserted into a plasmid vector adapted for expression in a bacterial cell (such as E. coli) or a yeast cell (such as Saccharomyces cerevisiae), or into a baculovirus vector for expression in an insect cell. Such vectors comprise the regulatory elements necessary for expression of the DNA in the host cell, positioned in such a manner as to permit expression of the DNA in the host cell. Such regulatory elements required for expression include promoter sequences, transcription initiation sequences and, optionally, enhancer sequences.

The polypeptides produced by gene expression in a recombinant procaryotic or eucaryotic system may be purified according to methods known in the art. In a preferred embodiment, a commercially available expression/secretion system can be used, whereby the recombinant protein is expressed and thereafter secreted from the host cell, and, thereafter, purified from the surrounding medium. An alternative approach involves purifying the recombinant protein by affinity separation, e.g., via immunological interaction with antibodies that bind specifically to the recombinant protein.

The polypeptides of the invention, prepared by the aforementioned methods, may be analyzed according to standard procedures.

Polypeptides purified from coffee or recombinantly produced may be used to generate polyclonal or monoclonal antibodies, antibody fragments or derivatives as defined herein, according to known methods. In addition to making antibodies to the entire recombinant protein, if analyses of the proteins or Southern and cloning analyses (see below) indicate that the cloned genes belong to a multigene family, then member-specific antibodies made to synthetic peptides corresponding to nonconserved regions of the protein can be generated.

Kits comprising an antibody of the invention for any of the purposes described herein are also included within the scope of the invention. In general, such a kit includes a control antigen for which the antibody is immunospecific.

Vectors, Cells, Tissues and Plants:

Also featured in accordance with the present invention are vectors and kits for producing transgenic host cells that contain an invertase or invertase inhibitor encoding polynucleotide or oligonucleotide, or variants thereof in a sense or antisense orientation, or reporter gene and other constructs under control of sucrose metabolizing enzyme promoters and other regulatory sequences. Suitable host cells include, but are not limited to, plant cells, bacterial cells, yeast and other fungal cells, insect cells and mammalian cells. Vectors for transforming a wide variety of these host cells are well known to those of skill in the art. They include, but are not limited to, plasmids, cosmids, baculoviruses, bacmids, bacterial artificial chromosomes (BACs), yeast artificial chromosomes (YACs), as well as other bacterial, yeast and viral vectors. Typically, kits for producing transgenic host cells will contain one or more appropriate vectors and instructions for producing the transgenic cells using the vector. Kits may further include one or more additional components, such as culture media for culturing the cells, reagents for performing transformation of the cells and reagents for testing the transgenic cells for gene expression, to name a few.

The present invention includes transgenic plants comprising one or more copies of an invertase- or invertase inhibitor-encoding gene, or nucleic acid sequences that inhibit the production or function of a plant's endogenous invertase. This is accomplished by transforming plant cells with a transgene that comprises part or all of an invertase or invertase inhibitor coding sequence, or mutant, antisense or variant thereof, including RNA, controlled by either native or recombinant regulatory sequences, as described below. Transgenic plants of coffee species are preferred, including, without limitation, C. abeokutae, C. arabica, C. arnoldiana, C. aruwemiensis, C. bengalensis, C. canephora, C. congensis, C. dewevrei, C. excelsa, C. eugenioides, C. heterocalyx, C. kapakata, C. khasiana, C. liberica, C. moloundou, C. racemosa, C. salvatrix, C. sessiliflora, C. stenophylla, C. travancorensis, C. wightiana and C. zanguebariae. Plants of any species are also included in the invention; these include, but are not limited to, tobacco, Arabidopsis and other “laboratory-friendly” species, cereal crops such as maize, wheat, rice, soybean, barley, rye, oats, sorghum, alfalfa, clover and the like, oil-producing plants such as canola, safflower, sunflower, peanut, cacao and the like, vegetable crops such as tomato, tomatillo, potato, pepper, eggplant, sugar beet, carrot, cucumber, lettuce, pea and the like, horticultural plants such as aster, begonia, chrysanthemum, delphinium, petunia, zinnia, lawn and turfgrasses and the like.

Transgenic plants can be generated using standard plant transformation methods known to those skilled in the art. These include, but are not limited to, Agrobacterium vectors, polyethylene glycol treatment of protoplasts, biolistic DNA delivery, UV laser microbeam, gemini virus vectors or other plant viral vectors, calcium phosphate treatment of protoplasts, electroporation of isolated protoplasts, agitation of cell suspensions in solution with microbeads coated with the transforming DNA, agitation of cell suspension in solution with silicon fibers coated with transforming DNA, direct DNA uptake, liposome-mediated DNA uptake, and the like. Such methods have been published in the art. See, e.g., Methods for Plant Molecular Biology (Weissbach & Weissbach, eds., 1988); Methods in Plant Molecular Biology (Schuler & Zielinski, eds., 1989); Plant Molecular Biology Manual (Gelvin, Schilperoort, Verma, eds., 1993); and Methods in Plant Molecular Biology—A Laboratory Manual (Maliga, Klessig, Cashmore, Gruissem & Varner, eds., 1994).

The method of transformation depends upon the plant to be transformed. Agrobacterium vectors are often used to transform dicot species. Agrobacterium binary vectors include, but are not limited to, BIN19 and derivatives thereof, the pBI vector series, and binary vectors pGA482, pGA492, pLH7000 (GenBank Accession AY234330) and any suitable one of the pCAMBIA vectors (derived from the pPZP vectors constructed by Hajdukiewicz, Svab & Maliga, (1994) Plant Mol Biol 25: 989-994, available from CAMBIA, GPO Box 3200, Canberra ACT 2601, Australia or via the worldwide web at CAMBIA.org). For transformation of monocot species, biolistic bombardment with particles coated with transforming DNA and silicon fibers coated with transforming DNA are often useful for nuclear transformation. Alternatively, Agrobacterium “superbinary” vectors have been used successfully for the transformation of rice, maize and various other monocot species.

DNA constructs for transforming a selected plant comprise a coding sequence of interest operably linked to appropriate 5′ regulatory sequences (e.g., promoters and translational regulatory sequences) and 3′ regulatory sequences (e.g., terminators). In a preferred embodiment, a dehydrin or LEA protein coding sequence under control of its natural 5′ and 3′ regulatory elements is utilized. In other embodiments, dehydrin or LEA protein coding and regulatory sequences are swapped (e.g., CcLEA1 coding sequence operably linked to CcDH2 promoter) to alter the water or protein content of the seed of the transformed plant for a phenotypic improvement, e.g., in flavor, aroma or other feature.

In an alternative embodiment, the coding region of the gene is placed under a powerful constitutive promoter, such as the Cauliflower Mosaic Virus (CaMV) 35S promoter or the figwort mosaic virus 35S promoter. Other constitutive promoters contemplated for use in the present invention include, but are not limited to: T-DNA mannopine synthetase, nopaline synthase and octopine synthase promoters. In other embodiments, a strong monocot promoter is used, for example, the maize ubiquitin promoter, the rice actin promoter or the rice tubulin promoter (Jeon et al., Plant Physiology. 123: 1005-14, 2000).

Transgenic plants expressing invertase or invertase inhibitor coding sequences under an inducible promoter are also contemplated to be within the scope of the present invention. Inducible plant promoters include the tetracycline repressor/operator controlled promoter, the heat shock gene promoters, stress (e.g., wounding)-induced promoters, defense responsive gene promoters (e.g. phenylalanine ammonia lyase genes), wound induced gene promoters (e.g. hydroxyproline rich cell wall protein genes), chemically-inducible gene promoters (e.g., nitrate reductase genes, glucanase genes, chitinase genes, etc.) and dark-inducible gene promoters (e.g., asparagine synthetase gene) to name a few.

Tissue specific and development-specific promoters are also contemplated for use in the present invention, in addition to the seed-specific dehydrin or LEA protein promoters of the invention. Non-limiting examples of other seed-specific promoters include Cim1 (cytokinin-induced message), cZ19B1 (maize 19 kDa zein), milps (myo-inositol-1-phosphate synthase), and celA (cellulose synthase) (U.S. application Ser. No. 09/377,648), bean beta-phaseolin, napin, beta-conglycinin, soybean lectin, cruciferin, maize 15 kDa zein, 22 kDa zein, 27 kDa zein, g-zein, waxy, shrunken 1, shrunken 2, and globulin 1, soybean 11S legumin (Bäumlein et al., 1992), and C. canephora 11S seed storage protein (Marraccini et al., 1999, Plant Physiol. Biochem. 37: 273-282). See also WO 00/12733, where seed-preferred promoters from end1 and end2 genes are disclosed. Other Coffea seed specific promoters may also be utilized, including but not limited to the oleosin gene promoter described in commonly-owned, co-pending PCT Application No. [NOT YET ASSIGNED] and the dehydrin gene promoter described in commonly-owned, co-pending PCT Application No. [NOT YET ASSIGNED]. Examples of other tissue-specific promoters include, but are not limited to: the ribulose bisphosphate carboxylase (RuBisCo) small subunit gene promoters (e.g., the coffee small subunit promoter as described by Marraccini et al., 2003) or chlorophyll a/b binding protein (CAB) gene promoters for expression in photosynthetic tissue; and the root-specific glutamine synthetase gene promoters where expression in roots is desired.

The coding region is also operably linked to an appropriate 3′ regulatory sequence. In embodiments where the native 3′ regulatory sequence is not used, the nopaline synthetase polyadenylation region may be used. Other useful 3′ regulatory regions include, but are not limited to, the octopine synthase polyadenylation region.

The selected coding region, under control of appropriate regulatory elements, is operably linked to a nuclear drug resistance marker, such as kanamycin resistance. Other useful selectable marker systems include genes that confer antibiotic or herbicide resistances (e.g., resistance to hygromycin, sulfonylurea, phosphinothricin, or glyphosate) or genes conferring selective growth (e.g., phosphomannose isomerase, enabling growth of plant cells on mannose). Selectable marker genes include, without limitation, genes encoding antibiotic resistance, such as those encoding neomycin phosphotransferase II (NEO), dihydrofolate reductase (DHFR) and hygromycin phosphotransferase (HPT), as well as genes that confer resistance to herbicidal compounds, such as glyphosate-resistant EPSPS and/or glyphosate oxidoreductase (GOX), bromoxynil nitrilase (BXN) for resistance to bromoxynil, AHAS genes for resistance to imidazolinones, sulfonylurea resistance genes, and 2,4-dichlorophenoxyacetate (2,4-D) resistance genes.

In certain embodiments, promoters and other expression regulatory sequences encompassed by the present invention are operably linked to reporter genes. Reporter genes contemplated for use in the invention include, but are not limited to, genes encoding green fluorescent protein (GFP), red fluorescent protein (DsRed), Cyan Fluorescent Protein (CFP), Yellow Fluorescent Protein (YFP), Cerianthus Orange Fluorescent Protein (cOFP), alkaline phosphatase (AP), β-lactamase, chloramphenicol acetyltransferase (CAT), adenosine deaminase (ADA), aminoglycoside phosphotransferase (neo^(r), G418^(r)), dihydrofolate reductase (DHFR), hygromycin-B-phosphotransferase (HPH), thymidine kinase (TK), lacZ (encoding β-galactosidase), xanthine guanine phosphoribosyltransferase (XGPRT), Beta-Glucuronidase (gus), Placental Alkaline Phosphatase (PLAP), Secreted Embryonic Alkaline Phosphatase (SEAP), or Firefly or Bacterial Luciferase (LUC). As with many of the standard procedures associated with the practice of the invention, skilled artisans will be aware of additional sequences that can serve the function of a marker or reporter.

Additional sequence modifications are known in the art to enhance gene expression in a cellular host. These modifications include elimination of sequences encoding superfluous polyadenylation signals, exon-intron splice site signals, transposon-like repeats, and other such well-characterized sequences that may be deleterious to gene expression. Alternatively, if necessary, the G/C content of the coding sequence may be adjusted to levels average for a given coffee plant cell host, as calculated by reference to known genes expressed in a coffee plant cell. Also, when possible, the coding sequence is modified to avoid predicted hairpin secondary mRNA structures. Another alternative to enhance gene expression is to use 5′ leader sequences. Translation leader sequences are well known in the art, and include the cis-acting derivative (omega′) of the 5′ leader sequence (omega) of the tobacco mosaic virus, the 5′ leader sequences from brome mosaic virus, alfalfa mosaic virus, and turnip yellow mosaic virus.
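By way of illustration only, the G/C content referred to above can be computed for any coding sequence with a few lines of Python. The following is a minimal sketch; the function name is hypothetical, and the sample input is simply the INV1-ATG primer sequence given in Example 1, used here as a convenient test string rather than as a reference for coffee G/C content.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


# Sample input only: the INV1-ATG primer from Example 1.
# A real comparison would use full coding sequences and a reference
# value calculated from known coffee genes (not provided here).
sample = "atggctagcttttacctctggctaatgtg"
print(f"GC fraction of sample: {gc_content(sample):.3f}")
```

A candidate coding sequence whose G/C fraction differs markedly from the average of known coffee genes would be a candidate for adjustment.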

Plants are transformed and thereafter screened for one or more properties, including the presence of the transgene product, the transgene-encoding mRNA, or an altered phenotype associated with expression of the transgene. It should be recognized that the amount of expression, as well as the tissue- and temporal-specific pattern of expression of the transgenes in transformed plants can vary depending on the position of their insertion into the nuclear genome. Such positional effects are well known in the art. For this reason, several nuclear transformants should be regenerated and tested for expression of the transgene.

Methods:

The nucleic acids and polypeptides of the present invention can be used in any one of a number of methods whereby the protein products can be expressed in coffee plants in order that the proteins may play a role in the enhancement of the flavor and/or aroma of the coffee beverage or coffee products ultimately produced from the bean of the coffee plant expressing the protein.

There is a strong correlation between the sucrose concentration in green beans and high quality coffee (Russwurm, 1969; Holscher and Steinhart, 1995; Badoud, 2000; Illy and Viani, 1995; Leloup et al., 2003). Improvement of coffee grain sucrose content can be obtained by (1) classical breeding or (2) genetic engineering techniques, and by combining these two approaches. Both approaches have been considerably improved by the isolation and characterization of sucrose metabolism-related genes in coffee, in accordance with the present invention. For example, the sucrose metabolism enzyme-encoding genes may be genetically mapped and Quantitative Trait Loci (QTL) involved in coffee flavor can be identified. It would then be possible to determine if such QTL correlate with the position of sucrose related genes. Alleles (haplotypes) for genes affecting sucrose metabolism may also be identified and examined to determine if the presence of specific haplotypes is strongly correlated with high sucrose. These “high sucrose” markers can be used to advantage in marker assisted breeding programs. A third advantage of isolating polynucleotides involved in sucrose metabolism is to generate expression data for these genes during coffee bean maturation in varieties with high and low sucrose levels, examples of which are discussed in the Examples, below. This information is used to direct the choice of genes to use in genetic manipulation aimed at generating novel transgenic coffee plants that have increased sucrose levels in the mature bean, as described in detail below.

In one aspect, the present invention features methods to alter the sucrose metabolizing enzyme profile, or sugar profile, in a plant, preferably coffee, comprising increasing or decreasing an amount or activity of one or more sucrose metabolizing enzymes in the plant. Specific embodiments of the present invention provide methods for altering the sugar profile of a plant by increasing or decreasing production of invertases or invertase inhibitors.

The data produced in accordance with the present invention strongly indicate that a decrease in invertase activity (acid or neutral invertases) at the final stages of coffee grain maturation will lead to increased sucrose accumulation in the grain. Accordingly, one preferred embodiment of the present invention comprises transforming coffee plants with an invertase inhibitor-encoding polynucleotide, such as a cDNA corresponding to SEQ ID NO: 5, 6, 7 or 8, for the purpose of over-producing that inhibitor in various tissues of coffee. In one embodiment, coffee plants are engineered for a general increase in invertase inhibitor production, e.g., through the use of a promoter such as the RuBisCo small subunit (SSU) promoter or the CaMV35S promoter functionally linked to an invertase inhibitor gene. In another embodiment designed to limit overproduction of the invertase inhibitor only to the sink organ of interest, i.e., the grain, a grain-specific promoter may be utilized, particularly one of the Coffea grain-specific promoters described above.

The sucrose profile of a plant, such as coffee, may be enhanced by modulating the production, or activity, of one or more invertases or invertase inhibitors in the plant. Additionally, plants expressing enhanced sucrose levels may be screened for naturally-occurring variants of the invertase or invertase inhibitor. For instance, loss-of-function (null) mutant plants may be created or selected from populations of plant mutants currently available. It will also be appreciated by those of skill in the art that mutant plant populations may also be screened for mutants that under- or over-express a particular sucrose metabolizing enzyme, utilizing one or more of the methods described herein. Mutant populations can be made by chemical mutagenesis, radiation mutagenesis, and transposon or T-DNA insertions, or targeting induced local lesions in genomes (TILLING, see, e.g., Henikoff et al., 2004, Plant Physiol. 135(2): 630-636; Gilchrist & Haughn, 2005, Curr. Opin. Plant Biol. 8(2): 211-215). The methods to make mutant populations are well known in the art.

The nucleic acids of the invention can be used to identify mutant forms of sucrose metabolizing enzymes in various plant species. In species such as maize or Arabidopsis, where transposon insertion lines are available, oligonucleotide primers can be designed to screen lines for insertions in the invertase or invertase inhibitor genes. Through breeding, a plant line may then be developed that is heterozygous or homozygous for the interrupted gene.

A plant also may be engineered to display a phenotype similar to that seen in null mutants created by mutagenic techniques. A transgenic null mutant can be created by expressing a mutant form of a selected invertase protein to create a “dominant negative effect.” While not limiting the invention to any one mechanism, this mutant protein will compete with wild-type protein for interacting proteins or other cellular factors. Examples of this type of “dominant negative” effect are well known for both insect and vertebrate systems (Radke et al., 1997, Genetics 145: 163-171; Kolch et al., 1991, Nature 349: 426-428).

Another kind of transgenic null mutant can be created by inhibiting the translation of sucrose metabolizing enzyme-encoding mRNA by “post-transcriptional gene silencing.” These techniques may be used to advantage to down-regulate invertases in a plant grain, thereby promoting sucrose accumulation. For instance, an invertase-encoding gene from the species targeted for down-regulation, or a fragment thereof, may be utilized to control the production of the encoded protein. Full-length antisense molecules can be used for this purpose. Alternatively, antisense oligonucleotides targeted to specific regions of the mRNA that are critical for translation may be utilized. The use of antisense molecules to decrease expression levels of a pre-determined gene is known in the art. Antisense molecules may be provided in situ by transforming plant cells with a DNA construct which, upon transcription, produces the antisense RNA sequences. Such constructs can be designed to produce full-length or partial antisense sequences. This gene silencing effect can be enhanced by transgenically over-producing both sense and antisense RNA of the gene coding sequence so that a high amount of dsRNA is produced (for example see Waterhouse et al., 1998, PNAS 13959-13964). In this regard, dsRNA containing sequences that correspond to part or all of at least one intron have been found particularly effective. In one embodiment, part or all of the invertase-encoding sequence antisense strand is expressed by a transgene. In another embodiment, hybridizing sense and antisense strands of part or all of the coding sequence are transgenically expressed.

In another embodiment, genes may be silenced through the use of a variety of other post-transcriptional gene silencing (RNA silencing) techniques that are currently available for plant systems. RNA silencing involves the processing of double-stranded RNA (dsRNA) into small 21-28 nucleotide fragments by an RNase III-type enzyme (“Dicer” or “Dicer-like”). The cleavage products, which are siRNA (small interfering RNA) or miRNA (micro-RNA), are incorporated into protein effector complexes that regulate gene expression in a sequence-specific manner (for reviews of RNA silencing in plants, see Horiguchi, 2004, Differentiation 72: 65-73; Baulcombe, 2004, Nature 431: 356-363; Herr, 2004, Biochem. Soc. Trans. 32: 946-951).

Small interfering RNAs may be chemically synthesized or transcribed and amplified in vitro, and then delivered to the cells. Delivery may be through microinjection (Tuschl T et al., 2002), chemical transfection (Agrawal N et al., 2003), electroporation or cationic liposome-mediated transfection (Brummelkamp T R et al., 2002; Elbashir S M et al., 2002), or any other means available in the art, which will be appreciated by the skilled artisan. Alternatively, the siRNA may be expressed intracellularly by inserting DNA templates for siRNA into the cells of interest, for example, by means of a plasmid (Tuschl T et al., 2002), and may be specifically targeted to select cells. Small interfering RNAs have been successfully introduced into plants (Klahre U et al., 2002).

A preferred method of RNA silencing in the present invention is the use of short hairpin RNAs (shRNA). A vector containing a DNA sequence encoding a particular desired siRNA sequence is delivered into a target cell by any common means. Once in the cell, the DNA sequence is continuously transcribed into RNA molecules that loop back on themselves and form hairpin structures through intramolecular base pairing. These hairpin structures, once processed by the cell, are equivalent to siRNA molecules and are used by the cell to mediate RNA silencing of the desired protein. Various constructs of particular utility for RNA silencing in plants are described by Horiguchi, 2004, supra. Typically, such a construct comprises a promoter, a sequence of the target gene to be silenced in the “sense” orientation, a spacer, the antisense of the target gene sequence, and a terminator.
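The sense-spacer-antisense arrangement described above can be sketched in Python as follows. This is an illustrative sketch only: the function names, the toy target fragment and the placeholder spacer are hypothetical, and the promoter and terminator, which flank the insert in an actual construct, are not modeled.

```python
# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement (antisense strand read 5' to 3')."""
    return seq.translate(COMPLEMENT)[::-1]


def hairpin_insert(target: str, spacer: str) -> str:
    """Assemble sense fragment + spacer + antisense fragment.

    When transcribed, the sense and antisense arms can base-pair
    intramolecularly, with the spacer forming the loop.
    """
    return target + spacer + reverse_complement(target)


# Toy example: a short hypothetical target fragment and placeholder spacer.
insert = hairpin_insert("ATGGCTAGC", "NNNNNN")
print(insert)
```

In practice the target fragment would be taken from the invertase-encoding sequence to be silenced, and the spacer would typically be a defined sequence such as an intron.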

Yet another type of synthetic null mutant can also be created by the technique of “co-suppression” (Vaucheret et al., 1998, Plant J. 16(6): 651-659). Plant cells are transformed with a copy of the endogenous gene targeted for repression. In many cases, this results in the complete repression of the native gene as well as the transgene. In one embodiment, an invertase-encoding gene from the plant species of interest is isolated and used to transform cells of that same species.

Mutant or transgenic plants produced by any of the foregoing methods are also featured in accordance with the present invention. Preferably, the plants are fertile, thereby being useful for breeding purposes. Thus, mutant or transgenic plants that exhibit one or more of the aforementioned desirable phenotypes can be used for plant breeding, or directly in agricultural or horticultural applications. They will also be of utility as research tools for the further elucidation of the participation of sucrose metabolizing enzymes and their effects on sucrose levels, which in turn affect the flavor, aroma and other features of coffee seeds. Plants containing one transgene or a specified mutation may also be crossed with plants containing a complementary transgene or genotype in order to produce plants with enhanced or combined phenotypes.

The following examples are provided to describe the invention in greater detail. The examples are for illustrative purposes, and are not intended to limit the invention.

Example 1 Materials and Methods for Subsequent Examples

Plant Material. Tissues from either leaves, flowers, stem, roots, or cherries at different stages of development were harvested from Coffea arabica L. cv. Caturra T2308 grown under greenhouse conditions (25° C., 70% RH) or from Coffea canephora BP409 (robusta) grown in the field at the Indonesian Coffee and Cacao Research Center (ICCRI), Indonesia. The fruit was harvested at defined stages and frozen immediately in liquid nitrogen, and then packaged in dry ice for transport. Cherries from FRT05, FRT64 (Robusta) and CCCA12 (Arabica) were obtained from trees cultivated in Quito, Ecuador. Samples were frozen at −25° C. for transportation, then stored at −80° C. until use.

Universal Genome Walker. Genomic DNA from BP409 was extracted from leaves harvested from greenhouse-grown trees according to Crouzillat et al., 1996. Genomic DNA was digested with four different restriction enzymes (DraI, EcoRV, PvuI, StuI) and the resulting fragments were blunt-end ligated to the GenomeWalker Adaptor provided by the Universal GenomeWalker kit (BD Biosciences). Both sets of reactions were carried out in accordance with the kit user manual. The four libraries were then employed as templates in PCR reactions using Gene-Specific Primers (GSP) (Table 1). The reaction mixtures contained 1 μl of GenomeWalker library template, 10 nmol of each dNTP, 50 pmol of each primer and 1 U of DNA polymerase (Takara, Cambrex Bio, Belgium) in a final volume of 50 μl with the appropriate buffer from Takara. The following conditions were used for the first PCR: after pre-denaturing at 95° C. for 2 min, the first seven cycles were performed at a denaturing temperature of 95° C. for 30 s, followed by an annealing and elongation step at 72° C. for 3 min. A further 35 cycles were carried out, with the denaturation step at 95° C. for 30 s followed by the annealing/elongation step at 67° C. for 3 min. Products from the first amplification, performed with the primer pair AP1/GSP-GW1, served as template for the second PCR, performed with the primer pair AP2/GSP-GWN1. The second PCR used 2 μl of the first amplification reaction (undiluted and different dilutions up to 1:50), and was performed as described above for the first reaction, with the exception that the second reaction used only 25 cycles of amplification. The resulting PCR fragments were separated by agarose gel electrophoresis. PCR fragments from the major bands were purified, cloned and sequenced.
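For planning purposes, the two-stage cycling program of the first PCR can be written down as a small data structure and its programmed hold time summed. The sketch below is illustrative only: the class and function names are hypothetical, the second PCR is not modeled, and instrument ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    """One block of identical two-step cycles."""
    cycles: int
    denature_s: int            # hold at 95 C, in seconds
    anneal_extend_s: int       # combined annealing/elongation hold, in seconds
    anneal_extend_temp_c: int  # temperature of the combined step


def programmed_seconds(pre_denature_s: int, stages: list) -> int:
    """Sum of programmed hold times, excluding ramping between temperatures."""
    cycling = sum(s.cycles * (s.denature_s + s.anneal_extend_s) for s in stages)
    return pre_denature_s + cycling


# First PCR as described in the text: 2 min at 95 C, then
# 7 cycles of 30 s at 95 C / 3 min at 72 C, then
# 35 cycles of 30 s at 95 C / 3 min at 67 C.
first_pcr = [
    Stage(cycles=7, denature_s=30, anneal_extend_s=180, anneal_extend_temp_c=72),
    Stage(cycles=35, denature_s=30, anneal_extend_s=180, anneal_extend_temp_c=67),
]

total_s = programmed_seconds(pre_denature_s=120, stages=first_pcr)
print(f"Programmed hold time: {total_s} s ({total_s / 60:.0f} min)")
```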

TABLE 1
List of primers used for Genome Walker experiments

Primer       Sequence (5′ to 3′)               SEQ ID NO.:
AP1          gtaatacgactcactatagggc            31
AP2          actatagggcacgcgtggt               32
INV1-GW1     gcgatttgacccattctatcaggtacg       33
INV1-GWN1    ttgctggttcttagggtctatgccagt       34
INV3-GW1     acaatggtggatcttggccagt            35
INV3-GWN1    tttgtcagcaggtccacgaggag           36
INV3-GW2     acaatggtggatcttggccagt            37
INV3-GWN2    tttgtcagcaggtccacgaggag           38
INV3-GW3     ggatacaaaaccagtaaagccagaagtgct    39
INV3-GWN3    gttgcagaattggattactgggtactg       40
INV3-GW4     tccagagtcaactggagcaactcttcca      41
INV3-GWN4    atgccagagcacttggcacaaagtctcgt     42
INV3-GW5     gagagcttcccaagcatcagcaaccata      43
INV3-GWN5    agacaactcgctcagtgatctctcatca      44

DNA sequence analysis. For DNA sequencing, recombinant plasmid DNA was prepared and sequenced according to standard methods. Computer analysis was performed using DNA Star (Lasergene) software. Sequence homologies were verified against GenBank databases using BLAST programs (Altschul et al. 1990).

cDNA preparation. RNA was extracted from different tissues, i.e., root, stem, leaves, flowers, pericarp and grain at four different maturation stages, SG (small green), LG (large green), Y (yellow) and R (red), as described previously (Benamor and Mc Carthy, 2003). cDNA was prepared from total RNA and oligo dT(18) (Sigma) as follows: 1 μg total RNA sample plus 50 ng oligo dT was made up to 12 μl final volume with DEPC-treated water. This mixture was subsequently incubated at 70° C. for 10 min and then rapidly cooled on ice. Next, 4 μl of first strand buffer (5×, Invitrogen), 2 μl of DTT (0.1 M, Invitrogen) and 1 μl of dNTP mix (10 mM each, Invitrogen) were added. These reaction mixes were preincubated at 42° C. for 2 min before adding 1 μl of SuperScript III RNase H- reverse transcriptase (200 U/μl, Invitrogen). Subsequently, the tubes were incubated at 42° C. for 50 min, followed by enzyme inactivation by heating at 70° C. for 10 min. The cDNA samples generated were then diluted one hundred fold and 5 μl of the diluted cDNA were used for Q-PCR.
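The volumes and dilutions in this protocol can be checked with simple arithmetic. The following Python sketch totals the reaction components listed above and derives the amount of input RNA represented in each Q-PCR reaction; the variable names are hypothetical, and the "RNA equivalent" figure assumes only that the cDNA is diluted proportionally, not that reverse transcription is complete.

```python
# Volumes (in microliters) added to the first-strand reaction, as listed in the text.
components_ul = {
    "total RNA + oligo dT in DEPC-treated water": 12.0,
    "5x first strand buffer": 4.0,
    "0.1 M DTT": 2.0,
    "dNTP mix (10 mM each)": 1.0,
    "reverse transcriptase": 1.0,
}

total_volume_ul = sum(components_ul.values())

# Final buffer strength: 4 ul of 5x buffer in the total reaction volume.
buffer_final_x = 5.0 * components_ul["5x first strand buffer"] / total_volume_ul

# Input RNA represented per Q-PCR reaction after the 100-fold dilution.
rna_input_ng = 1000.0        # 1 ug total RNA
dilution_factor = 100.0      # cDNA diluted one hundred fold
qpcr_template_ul = 5.0       # diluted cDNA used per Q-PCR reaction
rna_equivalent_per_qpcr_ng = (
    rna_input_ng / total_volume_ul / dilution_factor * qpcr_template_ul
)

print(f"Reaction volume: {total_volume_ul} ul")
print(f"Buffer strength: {buffer_final_x}x")
print(f"RNA equivalent per Q-PCR: {rna_equivalent_per_qpcr_ng} ng")
```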

3′ RACE (Rapid Amplification of 3′ cDNA ends) for CcINV1 cDNA isolation. RNA was extracted from pericarp and grain at four different maturation stages, SG, LG, Y and R, as described previously (Benamor and Mc Carthy, 2003; Benamor et al., report in preparation). Then cDNA was prepared from total RNA using the dT₍₁₈₎-Tail primer (^(5′)cttccgatccctacgctttttttttttttttttt^(3′)) (SEQ ID NO:45) as follows: 1 μg total RNA sample plus 50 ng dT₍₁₈₎-Tail primer was made up to 12 μl final volume with DEPC-treated water. This mixture was subsequently incubated at 70° C. for 10 min and then rapidly cooled on ice. Next, 4 μl of first strand buffer (5×, Invitrogen), 2 μl of DTT (0.1 M, Invitrogen) and 1 μl of dNTP mix (10 mM each, Invitrogen) were added. These reaction mixes were preincubated at 42° C. for 2 min before adding 1 μl of SuperScript III RNase H- reverse transcriptase (200 U/μl, Invitrogen). Subsequently, the tubes were incubated at 42° C. for 50 min, followed by enzyme inactivation by heating at 70° C. for 10 min. The cDNA samples generated were used in a PCR reaction with Inv1-3′a1 (^(5′)gacgtgaatggttgctggtcagg^(3′)) (SEQ ID NO:46) and Tail-3′RACE (^(5′)cttccgatccctacgc^(3′)) (SEQ ID NO:47) as primers for the first PCR and Inv1-3′a2 (^(5′)tacagtgggtgctgagctttggt^(3′)) (SEQ ID NO:48) and Tail-3′RACE as primers for the second PCR. The PCR reactions were performed in 50 μl reactions as follows: 5 μl of cDNA; 1×PCR buffer (La PCR Buffer II Mg⁺⁺ plus), 800 nM of each gene-specific primer, 200 μM each dNTP, 0.5 U of DNA polymerase Takara LA Taq (Cambrex Bio Science). After denaturing at 94° C. for 5 min, the amplification consisted of 35 cycles of 1 min at 94° C., 1 min at 55° C. and 2 min at 72° C. An additional final step of elongation was done at 72° C. for 7 min.

Full length INV1 and INV3 cDNA amplification. In order to amplify full length INV1 and INV3 cDNA, two sets of primers were designed on the INV1 and INV3 sequences, respectively, obtained from the primer walking or 3′RACE experiments: INV1-ATG (^(5′)atggctagcttttacctctggctaatgtg^(3′)) (SEQ ID NO:49) with INV1-STOP (^(5′)tcaattctttcgattgatactggcattct^(3′)) (SEQ ID NO:50), and INV3-ATG (^(5′)atggagtgtgttagagaatatcaact^(3′)) (SEQ ID NO:51) with INV3-STOP (^(5′)tcagcaggtccacgaggaggatctct^(3′)) (SEQ ID NO:52). These two primer sets were used to perform RT-PCR reactions using the cDNA samples described above. The PCR reactions were performed in 50 μl reactions as follows: 5 μl of cDNA; 1×PCR buffer (La PCR Buffer II Mg⁺⁺ plus), 800 nM of each gene-specific primer, 200 μM each dNTP, 0.5 U of DNA polymerase Takara LA Taq (Cambrex Bio Science). After denaturing at 94° C. for 5 min, the amplification consisted of 35 cycles of 1 min at 94° C., 1 min at 55° C. and 2 min at 72° C. An additional final step of elongation was done at 72° C. for 7 min. The fragments obtained were purified from agarose gel, cloned and sequenced.

Quantitative-RT-PCR. TaqMan-PCR was carried out as recommended by the manufacturer (Applied Biosystems, Perkin-Elmer). The cDNA samples used in this experiment have been described earlier. All reactions contained 1× TaqMan buffer (Perkin-Elmer) and 5 mM MgCl₂, 200 μM each of dATP, dCTP, dGTP and dTTP, 5 μl cDNA, and 0.625 units of AmpliTaq Gold polymerase. PCR was carried out using 800 nM of each gene-specific primer, forward and reverse, and 200 nM TaqMan probe. Primers and probes were designed using PRIMER EXPRESS software (Applied Biosystems, Table 2). Reaction mixtures were incubated for 2 min at 50° C., 10 min at 95° C., followed by 40 amplification cycles of 15 sec at 95° C./1 min at 60° C. Samples were quantified in the GeneAmp 7500 Sequence Detection System (Applied Biosystems). Transcript levels were determined using rpl39 as a basis of comparison.
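The text does not state the formula used to express transcript levels relative to rpl39. One common way to do so is the comparative Ct approach, sketched below under the assumption of equal, near-perfect amplification efficiency for target and reference. The function name and the Ct values are hypothetical and are not data from these experiments.

```python
def relative_expression(ct_target: float, ct_reference: float,
                        efficiency: float = 2.0) -> float:
    """Expression of a target transcript relative to a reference transcript.

    Assumes both amplicons amplify with the same per-cycle efficiency
    (2.0 means perfect doubling). A lower Ct means more starting template.
    """
    return efficiency ** (ct_reference - ct_target)


# Hypothetical Ct values for illustration only.
ct_invertase = 27.0   # assumed Ct for a target such as CcInv1
ct_rpl39 = 25.0       # assumed Ct for the rpl39 reference
print(relative_expression(ct_invertase, ct_rpl39))
```

With these illustrative values the target crosses threshold two cycles later than the reference, corresponding to a four-fold lower starting amount under the stated assumption.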

TABLE 2
List of primers and probes used for Q-PCR experiment

Protein                cDNA      Primer or probe   Sequence (5′ to 3′)              SEQ ID NO.:
Invertases             CcInv1    CcInv1 F2         GTGAATGGTTGCTGGTCAGGAT           53
                                 CcInv1 R2         CAGTGTAGAGAATGGCTGGGTTTT         54
                                 CcInv1 MGB2       AACGACAATGCTTCGAGGG              55
                       CcInv2    CcInv2 F2         AGTTTATCCGACCAAGGCAATC           56
                                 CcInv2 R2         TCACCCCTGTGGCATTGTT              57
                                 CcInv2 MGB2       CAGCGCGACTCTT                    58
                       CcInv3    CcInv3 F1         CTTGCTGAGAGCCGTTTGCT             59
                                 CcInv3 R1         CAATATATCTACCAAGTTTGCCATCATAG    60
                                 CcInv3 MGB1       AGGACAGTTGGCCTGAGT               61
Invertase Inhibitors   CcInvI1   CcInvI1 F1        CGCCGTTGAGGCAGTTAGA              62
                                 CcInvI1 R1        TTAGCTCCTTGATGCTTTGCAA           63
                                 CcInvI1 MGB1      ACAAGAACTCA                      64
                       CcInvI2   CcInvI2 F1        AGGTGCATGATCAGACAATTGC           65
                                 CcInvI2 R1        GCACTGCCGGACATAAGGAT             66
                                 CcInvI2 MGB1      AGGGCAAGAAGCTG                   67
                       CcInvI3   CcInvI3 F1        GTTACTGCAAAGCCGCGTTTA            68
                                 CcInvI3 R1        GAAGAAATGCTAAGGTGGCTAGTTTT       69
                                 CcInvI3 MGB1      AGCATGGAGATTGAAGC                70
                       CcInvI4   CcInvI4 F1        CGATTGCAAGCTGGTGATTATG           71
                                 CcInvI4 R1        TTCAGTTTGAGCTGCTGATGCT           72
                                 CcInvI4 MGB1      AGGCGTGAATATCA                   73
                       rpl39     rpl39 F1          GAACAGGCCCATCCCTTATTG            74
                                 rpl39 R1          CGGCGCTTGGCATTGTA                75
                                 rpl39 MGB1        ATGCGCACTGACAACA                 76

MGB probes were labelled at the 5′ end with the fluorescent reporter dye 6-carboxyfluorescein (FAM) and at the 3′ end with the quencher dye 6-carboxy-tetramethyl-rhodamine (TAMRA). The rpl39 probe was labelled at the 5′ end with the fluorescent reporter dye VIC and at the 3′ end with the quencher TAMRA. All sequences are given 5′ to 3′.

Soluble Sugars quantification. Grain tissues were separated from pericarp and hulls. The grains were homogenized in a cryogenic grinder with liquid nitrogen and the powder obtained was lyophilized for 48 hours (Lyolab bII, Secfroid). Each sample was weighed and suspended in 70 ml of double-distilled water pre-heated to 70° C., then shaken vigorously and incubated for 30 min at 70° C. After cooling to room temperature, the sample was brought to 100 ml by adding double-distilled water, and then paper filtered (Schleicher and Schuell filter paper 597.5). Sugars of extracted coffee grain tissues were separated by HPAE-PED according to Locher et al., 1998 using a Dionex PA 100 (4×250 mm) column. Sugar concentration was expressed in g per 100 g of DW (dry weight).

Enzymatic Activity analysis. Neutral and acid invertase activities were measured according to King et al., 1997.

Example 2 Identification of cDNA Encoding Invertase Proteins in C. canephora

More than 47,000 EST sequences were identified from several coffee libraries made with RNA isolated from young leaves and from the grain and pericarp tissues of cherries harvested at different stages of development. Overlapping ESTs were subsequently “clustered” into “unigenes” (i.e., contigs) and the unigene sequences were annotated by doing a BLAST search of each individual sequence against the NCBI non-redundant protein database.

Enzymes directly involved in the synthesis and degradation of sucrose have been widely studied in plants, and especially during fruit, tuber, and seed development in plants such as tomato (Lycopersicon esculentum), potato (Solanum tuberosum) and corn (Zea mays). DNA sequences coding for all known key proteins involved in sucrose synthesis and degradation have been identified and characterized in several species and are available in GenBank. Accordingly, the known sequences of plant enzymes, especially sequences from organisms closely related to coffee (e.g., tomato and potato), were used to find similar sequences present in the above-described EST libraries and in other coffee cDNA libraries. To search the aforementioned EST collection, protein sequences of tomato and potato were used in a tBLASTn search of the “unigene” set 5 as described in Example 1. Those in silico “unigenes” whose open reading frames showed the highest degree of identity with the “query” sequence were selected for further study. In some cases, the selected “unigenes” contained at least one EST sequence that potentially represented a full length cDNA clone, and that clone was then selected for re-sequencing to confirm both its identity and the “unigene” sequence.

Based on their solubility, subcellular localization, pH-optima and isoelectric point, three different types of invertase isoenzymes can be distinguished: vacuolar (InvV), cell wall bound (InvCW) and neutral (InvN) invertases. InvV and InvCW have similar enzymatic and biochemical properties and share a high degree of overall sequence homology and two conserved amino acid motifs. One common feature is the pentapeptide N-DPN-G/A (SEQ ID NO:77) (β-Fructofuranosidase-motif; Sturm and Chrispeels, 1990; Roitsch and Gonzalez, 2004). The second conserved feature is the highly conserved cysteine sequence WECX(P/V)DF (SEQ ID NO:78) (Sturm and Chrispeels, 1990) in which V and P distinguish the Vacuolar and cell-wall (Periplasmic) invertase respectively.
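The use of the conserved cysteine motif to distinguish vacuolar from cell wall invertases can be sketched as a simple pattern search. The sketch below is illustrative only: the function name and the toy peptide strings are hypothetical, and the pattern treats the X position as optional, which is an assumption made here so that both the WECX(P/V)DF notation and the six-residue forms WECVDF and WECPDF cited in this specification are matched.

```python
import re
from typing import Optional

# W-E-C, an optional variable residue, then P or V, then D-F.
# Treating X as optional is an assumption (see lead-in above).
CYS_MOTIF = re.compile(r"WEC.?([PV])DF")


def classify_acid_invertase(protein: str) -> Optional[str]:
    """Classify by the P/V residue of the conserved cysteine motif.

    Returns 'vacuolar' for V, 'cell wall' for P, or None if the
    motif is not found in the sequence.
    """
    match = CYS_MOTIF.search(protein.upper())
    if match is None:
        return None
    return "vacuolar" if match.group(1) == "V" else "cell wall"


# Toy peptide fragments, not real invertase sequences.
print(classify_acid_invertase("GSAWECVDFYP"))
print(classify_acid_invertase("GSAWECPDFYP"))
```

A motif-based call of this kind is only one line of evidence and would normally be weighed together with overall sequence identity to characterized invertases.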

To find cDNA encoding the three invertase isoenzymes in coffee, protein sequences corresponding to (1) the tomato vacuolar invertase TIV-1, (2) the tomato cell wall invertase LIN6, and (3) the A. thaliana neutral (cytoplasmic) invertase-like protein have been used to perform a similarity search of the unigene set using the tBLASTn algorithm.

A. CcInv2 (SEQ ID NO: 10)

The ORF of unigene #127336 was found to have a high degree of homology with the tomato vacuolar invertase TIV-1 (NCBI Protein Identifier No. P29000; Klann et al., 1992). The single EST in this unigene, clone cccl20fl 1, was isolated and its insert fully sequenced. The cDNA insert was found to be 2212 bp long. The complete ORF sequence of this clone was 1761 bp long, starting at position 192 and finishing at position 1952. The deduced protein was 586 aa long with a predicted molecular weight of 64 kDa. The protein encoded by cccl20fl 1 has been annotated CcInv2 (Coffea canephora Invertase 2). CcInv2 is 69.6% identical to the tomato vacuolar invertase TIV-1 and 68.5% identical to an invertase characterized in potato, STVInv (FIG. 2). Marraccini et al. have recently placed a partial cDNA sequence from Coffea arabica potentially encoding a vacuolar invertase in the public databases (NCBI Nucleotide Identifier No. AJ575258). They have called this partial protein sequence Inv2 (NCBI Protein Identifier No. CAE01318). Partial alignment between CcInv2 and Inv2 has shown 93.8% identity (FIG. 2). The proposed vacuolar localization of this robusta invertase is supported by the presence of a V in the highly conserved WECVDF domain (FIG. 2, Sturm and Chrispeels, 1990), whereas the Inv2 protein sequence is characterized by the presence of a P in this domain, suggesting that Inv2 may be a cell wall bound invertase. The alignment in FIG. 2 shows that the N-terminal region of CcInv2 is shorter than those seen for two homologues from other plants. However, the cDNA insert of cccl20fl 1 actually starts 190 bp beyond the first amino acid shown for CcInv2 in FIG. 2. This 190 bp sequence has two open reading frames, but neither is in-frame with the major ORF. In addition, the amino acid sequences of the short ORFs do not correspond to sequences seen in the other two homologous sequences (FIG. 2). These results could be explained by either the N-terminal region of this Coffea canephora protein being shorter than the comparable region in homologous proteins of other plants, or the presence of an intron in this region of the cDNA clone.
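The relationship between the ORF coordinates and the deduced protein length reported for CcInv2 can be verified with a short calculation. The Python sketch below uses hypothetical function names and assumes 1-based, inclusive coordinates with the stop codon counted as part of the ORF.

```python
def orf_length_bp(start: int, end: int) -> int:
    """Length of an ORF given 1-based, inclusive start and end positions."""
    return end - start + 1


def protein_length_aa(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF of the given length."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = orf_bp // 3
    # The stop codon occupies one codon but encodes no amino acid.
    return codons - 1 if includes_stop else codons


# CcInv2 values from the text: ORF from position 192 to position 1952.
ccinv2_orf = orf_length_bp(192, 1952)
print(ccinv2_orf, protein_length_aa(ccinv2_orf))
```

Under these assumptions the coordinates give the 1761 bp ORF and 586 aa protein stated above.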

B. CcInv3 (SEQ ID NO: 11)

The protein encoded by the clone cccp28p22 (unigene #96095) has a high homology to the neutral cytoplasmic invertase from A. thaliana (Protein Identifier No. NP_567347). The protein encoded by the cccp28p22 clone has been annotated CcInv3 (Coffea canephora Invertase 3). According to the optimal alignment obtained, the cDNA insert of cccp28p22 is not full-length, i.e., it does not code for the entire protein (approximately 1500 bases are missing). Using several rounds of primer directed genome walking, we have been able to amplify the genomic sequence from C. canephora corresponding to the 5′ region upstream of the cccp28p22 sequence. Using specific primers, we have amplified the full length cDNA by RT-PCR. Several RNA samples from C. arabica and C. canephora were used; positive amplification corresponding to the full length cDNA sequence was only obtained using RNA extracted from arabica grain at the yellow stage. The protein encoded by this new cDNA sequence has been annotated CaInv3 (Coffea arabica Invertase 3). The CaInv3 cDNA is 1675 bp long. The deduced protein is 558 aa long, with a predicted molecular weight of 63.8 kDa. The protein sequence encoded by the CaInv3 cDNA shows a very high level of homology (83.7%) with the neutral cytoplasmic invertase from A. thaliana (FIG. 3).

C. CcInv4 (SEQ ID NO: 12)

The protein encoded by the clone cccs46w27d20 (unigene #123705) has a significant degree of identity (62.7%) with the tomato cell wall bound invertase LIN6 (NCBI Protein Identifier No. AAM28823). The alignment is shown in FIG. 4. According to the optimal alignment obtained, the cDNA insert of cccs46w27d20 is not full-length, i.e., it does not code for the entire protein (approximately 1500 bases are missing). It is important to note that the protein encoded by cccs46w27d20 also shares 38% identity with the tomato vacuolar invertase TIV-1 (Klann et al., 1992). The protein encoded by the cccs46w27d20 clone has been annotated CcInv4 (Coffea canephora Invertase 4). This protein shares higher homology with vacuolar invertase than cell wall bound invertase. Genome Walker and 5′ RACE have been carried out to isolate the missing 5′ end region.

Based on the data presented above, we have isolated one cDNA encoding each type of invertase isoenzyme from the C. canephora database.

D. CcInv1 (SEQ ID NO: 9)

A homologous full length cDNA sequence from C. canephora (robusta) was isolated using a partial cDNA sequence encoding a cell wall invertase from Coffea arabica (made available by Marraccini et al.: NCBI Nucleotide Identifier No. AJ575257, and the encoded partial protein sequence (Inv1) NCBI Protein Identifier No. CAE1317.1). Using the partial cDNA sequence and the 3′RACE, as well as “primer assisted” genome walking experiments, as described in Example 1, the homologous full length cDNA was found to be 1731 bp long and the deduced protein was 576 aa long with a predicted molecular weight of 64.6 kDa. This protein has been annotated CcInv1 (Coffea canephora Invertase 1).

The protein sequence obtained for CcInv1 is not identical to the sequence obtained by Marraccini et al., having 4 amino acid differences over the 163 amino acids known for the partial arabica cDNA sequence. An alignment of CcInv1 with several highly homologous database sequences shows that CcInv1 has 55.2% identity with the tomato cell wall bound LIN6 and 54.3% identity with DCCW Inv (FIG. 5), a cell wall bound invertase identified in carrot. The proposed cellular localization of CcInv1 is supported by the presence of a P in the highly conserved WECPDF domain (FIG. 5, Sturm and Chrispeels, 1990).
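As an illustration of how the identity figures quoted in this example can be read, the following minimal sketch computes percent identity either from a pairwise alignment or from a count of differing positions. The function names are hypothetical, and the choice of denominator (aligned columns without gaps) is an assumption; alignment programs differ on this point, and the software used to obtain the values reported here is not specified in this passage.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: gaps are written as '-' and gapped columns are excluded
    from the denominator. Other tools may use a different denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def identity_from_differences(n_compared: int, n_differences: int) -> float:
    """Percent identity when only the number of differing positions is known."""
    return 100.0 * (n_compared - n_differences) / n_compared


# 4 amino acid differences over the 163 residues of the partial arabica sequence
partial_identity = identity_from_differences(163, 4)
print(f"CcInv1 vs. partial arabica Inv1: {partial_identity:.1f}% identity")

# Toy comparison of the cell wall (WECPDF) and vacuolar (WECVDF) motifs
motif_identity = percent_identity("WECPDF", "WECVDF")
print(f"WECPDF vs. WECVDF: {motif_identity:.1f}% identity")
```

Under these assumptions, the 4 differences over 163 residues correspond to roughly 97.5% identity between CcInv1 and the partial arabica sequence.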

Example 3 Identification of cDNA Encoding Invertase Inhibitor Proteins in C. canephora

Publications over the past decade have shown that the activity of invertases can be regulated at the post-translational level by interaction with a group of small molecular weight proteins (<20 kDa) called invertase inhibitors (Greiner et al., 1998; Greiner et al., 2000; Helentjaris et al., 2001; Bate et al., 2004). Many sequences from several plant species have been identified in the public databases, but few of them have been characterized biochemically. Recently, two invertase inhibitors, NtINVINH1 from tobacco (Protein Identifier No. CAA73333; Greiner et al., 1998) and ZM-INVINH1 from maize (Nucleotide Identification No. AX214333; Bate et al., 2004, corresponding to protein ID. 1 in Helentjaris et al., 2001), have been biochemically characterized. For example, ZM-INVINH1 has been shown to directly control sucrose metabolism through its capability to act as a sucrose sensor (Bate et al., 2004). In the presence of high sucrose concentrations, the invertase inhibitor ZM-INVINH1 remains inactive, allowing sucrose hydrolysis during early fruit development. When the sucrose level falls below a specific threshold, this invertase inhibitor becomes active and inhibits invertase activity (Helentjaris et al., 2001; Bate et al., 2004).

Invertase inhibitor sequences from many different organisms (tomato, tobacco, maize and A. thaliana) are available in GenBank, but most of them have been annotated based simply on homology results obtained using BLAST and not by direct characterization of their biochemical activity. It is noted that the relatively small number of invertase inhibitors that have been characterized biochemically generally show weak homologies to one another (Bate et al., 2004), and to date, this class of protein has no defined highly conserved sequence motifs (Bate et al., 2004). Therefore, database entries annotated as “invertase inhibitors” or “invertase inhibitor-like protein” must be interpreted with caution. To perform the BLAST search in the coffee databases for coffee invertase inhibitors, we used sequences encoding the biochemically characterized invertase inhibitors ZM-INVINH1, NtInvI and protein ID. 31 in Helentjaris et al., 2001 (Protein Identifier No. CAC69345).

Based on this search, four clones with similarity to database invertase inhibitors have been identified in the EST databases: cccp2d1 (unigene #124209), cccs30w14i24 (unigene #125332), cccs30w24n8 (unigene #122705) and A5-1462.

A. CcInvI1 (SEQ ID NO: 13)

The 670 bp cDNA insert of the cccp2d1 clone is apparently full length, with a complete ORF sequence of 558 bp, encoding a protein with a potential molecular weight of 20.7 kDa. The protein sequence of cccp2d1 is 31.2% identical to the invertase inhibitor ZM-INVINH1 characterized in corn (Bate et al., 2004) (FIG. 6). This cDNA has been annotated CcInvI1 (Coffea canephora Invertase Inhibitor 1).

B. CcInvI2 (SEQ ID NO: 14)

The 629 bp cDNA insert of the cccs30w14i24 clone is apparently full length, with a complete ORF sequence of 537 bp, encoding a protein with a potential molecular weight of 19.6 kDa. The protein sequence of cccs30w14i24 is 34.6% identical to the invertase inhibitor NtInvI characterized in tobacco (Greiner et al., 1998; Weil et al., 1994) (FIG. 6). This cDNA has been annotated CcInvI2 (Coffea canephora Invertase Inhibitor 2).

C. CcInvI3 (SEQ ID NO: 15)

BLAST screening of the cDNA library described in PCT application PCT/EP2004/006805 resulted in the discovery of the cDNA clone A5-1462. The 704 bp cDNA insert of the A5-1462 clone is apparently full length, with a complete ORF sequence of 495 bp, encoding a protein with a potential molecular weight of 18.4 kDa. The protein sequence of A5-1462 is only 13% identical to ZM-INVINH1 (FIG. 6) but 24.4% identical to the protein ID. 31 (Nucleotide Identification No. AX214363; Helentjaris et al., 2001). This cDNA has been annotated CcInvI3 (Coffea canephora Invertase Inhibitor 3).

D. CcInvI4 (SEQ ID NO: 16)

The 640 bp cDNA insert of the cccs30w24n8 clone is apparently full length, with a complete ORF sequence of 555 bp, encoding a protein with a potential molecular weight of 20.2 kDa. The protein sequence of cccs30w24n8 is 20.5% identical to ZM-INVINH1 (FIG. 6) and 25.7% identical to the protein ID. 31 (Nucleotide Identification No. AX214363; Helentjaris et al., 2001). This cDNA has been annotated CcInvI4 (Coffea canephora Invertase Inhibitor 4).

As noted earlier, CcInvI proteins are not well conserved and share only weak homology with, for example, ZM-INVINH1 or NtInvI. Nevertheless, the four “conserved” Cys residues known to be essential for function (Rausch and Greiner, 2004; Scognamiglio et al., 2003; Hothorn et al., 2003; Hothorn et al., 2004) are present in each protein (FIG. 6).
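The ORF lengths and molecular weights reported above for the four CcInvI cDNAs can be cross-checked with a rough calculation. The sketch below is illustrative only: it assumes that each stated ORF length includes the stop codon, and it uses a rule-of-thumb average residue mass of 110 Da, which is not taken from this specification. The predicted weights reported above were presumably computed from the actual amino acid composition, so only approximate agreement is expected.

```python
# Assumption: rule-of-thumb average mass of one amino acid residue, in daltons.
AVG_RESIDUE_MASS_DA = 110.0

# ORF lengths (bp) and predicted molecular weights (kDa) as reported in Example 3.
ORF_BP = {"CcInvI1": 558, "CcInvI2": 537, "CcInvI3": 495, "CcInvI4": 555}
REPORTED_KDA = {"CcInvI1": 20.7, "CcInvI2": 19.6, "CcInvI3": 18.4, "CcInvI4": 20.2}


def residues_from_orf(orf_bp: int) -> int:
    """Number of encoded residues, assuming the ORF length includes the stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3 - 1  # the stop codon encodes no residue


def rough_mass_kda(n_residues: int) -> float:
    """Approximate protein mass from residue count (composition ignored)."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


for name, orf_bp in ORF_BP.items():
    n = residues_from_orf(orf_bp)
    estimate = rough_mass_kda(n)
    print(f"{name}: {orf_bp} bp -> {n} aa, ~{estimate:.1f} kDa "
          f"(reported {REPORTED_KDA[name]} kDa)")
```

With these assumptions, each estimate falls within about 0.5 kDa of the reported value. The same stop-codon convention also reproduces the 576 aa reported for CcInv1 from its 1731 bp sequence.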

Example 4 Acid and Neutral Invertase Activities During Coffee Bean Maturation

Concentrations of glucose, fructose and sucrose have been determined in whole grains from FRT05 (robusta) and CCCA12 (arabica) during coffee grain maturation. We have chosen to analyze these two genotypes because they have been previously found to have significantly different levels of sucrose (Charles Lambot, unpublished data). In order to understand the basis for this difference, we analyzed the accumulation of sucrose during grain development of these two varieties, as well as the levels of glucose and fructose. In parallel, acid and neutral invertase activities were examined in order to determine if there might be a correlation between free sugar accumulation and these particular activities. Similar experiments have been carried out using samples from a second robusta variety, FRT64. The results are shown in Table 3 and FIG. 7.

TABLE 3 Acid and neutral invertase activities during coffee bean maturation.

| Genotype | Development stage | Sucrose | Glucose | Fructose | Acid Invertase | Neutral Invertase |
|----------|-------------------|---------|---------|----------|----------------|-------------------|
| FRT05    | SG                | 0.72    | 1.54    | 0.33     | 1.50           | 0.43              |
|          | LG                | 1.45    | 1.71    | 0.09     | 0.58           | 0.17              |
|          | Y                 | 3.13    | 0.09    | 0        | 0.26           | 0.3               |
|          | R                 | 6.70    | 0.04    | 0.09     | 1.44           | 0.54              |
| FRT64    | SG                | 1.79    | 2.82    | 0.40     | 0.21           | 0.15              |
|          | LG                | 1.94    | 2.48    | 0.27     | 0.19           | 0.12              |
|          | Y                 | 4.46    | 0.04    | 0        | 0.45           | 0.28              |
|          | R                 | 6.6     | 0.07    | 0.16     | 0.58           | 0.51              |
| CCCA12   | SG                | 2.65    | 14.41   | 1.52     | 0.17           | 0.09              |
|          | LG                | 3.11    | 5.62    | 0.49     | 1.70           | 0.49              |
|          | Y                 | 8.04    | 0.1     | 0.12     | 0.19           | 0.20              |
|          | R                 | 9.83    | 0.08    | 0.1      | 0.34           | 0.14              |

Coffee cherries at four different maturation stages characterized by size and color were used for this study, i.e., SG (small green), LG (large green), Y (yellow) and R (red). Concentrations of sucrose, glucose and fructose in the coffee grain were measured in samples harvested in parallel to those used for the assays of invertase activity. Sugar concentrations are expressed in g/100 g DW (dry weight), while enzymatic activities are expressed in μmoles · h⁻¹ · mg⁻¹ protein.

A. Sugar Levels During Coffee Grain Maturation

At the earliest stage of maturity examined (stage SG), the main free sugar was glucose, but its concentration was about 10 times higher in CCCA12 (14%) than in FRT05 (1.5%). At the same stage, the fructose concentration was also higher in CCCA12 (1.5%) than in FRT05 (0.3%), although fructose clearly accumulated less than glucose. By the end of grain development, the concentrations of glucose and fructose had decreased to very low levels in both species, with only traces being detected at the mature red stage (R). The decrease in these two sugars was accompanied by an increase in sucrose, which approached 100% of total free sugars in mature grains and was again higher in arabica (9.82%) than in robusta (6.71%). The same general observations apply to the sucrose, glucose and fructose variations during FRT64 coffee bean maturation: glucose accumulated more than fructose at the earliest stage, and sucrose was the major sugar accumulated at the end of development. Interestingly, although FRT64 and FRT05 had the same final sucrose concentration at the R stage (around 6.6% of DW), more sucrose had accumulated in FRT64 than in FRT05 at all previous stages, i.e., SG (60% more), LG (25% more) and Y (30% more). It is important to note that these results represent only free sugar accumulation and do not include modified forms such as UDP-G, F6-P and S6-P, which are also directly involved in sucrose metabolism.
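The sugar comparisons in this section can be recomputed directly from the Table 3 values, as in the minimal sketch below. The dictionary layout and function names are hypothetical. The stage-by-stage percentages quoted for FRT64 versus FRT05 are reproduced when the difference is divided by the FRT64 value; that choice of denominator is an inference from the numbers and is not stated in the text.

```python
# Free sugar concentrations from Table 3, in g/100 g DW:
# (sucrose, glucose, fructose) per genotype and stage.
SUGARS = {
    "FRT05": {"SG": (0.72, 1.54, 0.33), "LG": (1.45, 1.71, 0.09),
              "Y": (3.13, 0.09, 0.0), "R": (6.70, 0.04, 0.09)},
    "FRT64": {"SG": (1.79, 2.82, 0.40), "LG": (1.94, 2.48, 0.27),
              "Y": (4.46, 0.04, 0.0), "R": (6.6, 0.07, 0.16)},
    "CCCA12": {"SG": (2.65, 14.41, 1.52), "LG": (3.11, 5.62, 0.49),
               "Y": (8.04, 0.1, 0.12), "R": (9.83, 0.08, 0.1)},
}


def sucrose_share(genotype: str, stage: str) -> float:
    """Sucrose as a percentage of the three free sugars measured."""
    sucrose, glucose, fructose = SUGARS[genotype][stage]
    return 100.0 * sucrose / (sucrose + glucose + fructose)


def sucrose_gap(stage: str) -> float:
    """FRT64 minus FRT05 sucrose, as a percentage of the FRT64 value (assumed denominator)."""
    frt64 = SUGARS["FRT64"][stage][0]
    frt05 = SUGARS["FRT05"][stage][0]
    return 100.0 * (frt64 - frt05) / frt64


for genotype in SUGARS:
    print(f"{genotype} R stage: sucrose is {sucrose_share(genotype, 'R'):.1f}% of free sugars")

for stage in ("SG", "LG", "Y"):
    print(f"{stage}: FRT64 vs. FRT05 sucrose gap = {sucrose_gap(stage):.0f}%")
```

Expressed relative to the FRT05 value instead, the same gaps would be considerably larger at each stage, so the denominator should be stated whenever such percentages are compared.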

B. Invertase Activity (Acid and Neutral) During Coffee Grain Maturation

Acid and neutral invertase activities evolved similarly during CCCA12 coffee grain maturation. Low acid (0.17 U) and neutral (0.09 U) invertase activities were observed at the SG stage of CCCA12. Both enzymatic activities rose drastically between the SG and LG stages, reaching 1.70 U for acid invertase (AI) and 0.49 U for neutral invertase (NI). In the later stages of development, AI and NI activities declined dramatically to reach similarly low levels at the Y stage (0.19 and 0.20 U, respectively). Between the Y and R stages, AI activity increased to 0.34 U, while NI activity decreased to 0.14 U. Interestingly, AI and NI activities varied in a manner similar to the SuSy activity previously observed for the same samples (see commonly owned, co-pending provisional application No. [NOT YET ASSIGNED]). There is a clear correlation between the decline in both invertase activities and sucrose accumulation in the latest stages of CCCA12 grain maturation.

Notably, AI and NI activities evolved very differently in FRT05 and FRT64 than in CCCA12. In FRT05, the AI (1.50 U) and NI (0.43 U) enzymes were highly active early in development (stage SG). AI activity decreased drastically between the SG and Y stages to reach 0.26 U (almost the same activity as that observed for CCCA12 at the Y stage), and then rose between the Y and R stages to reach 1.44 U. Neutral invertase activity also decreased, but only between the SG and LG stages; it then increased between the LG and R stages, reaching its maximum of 0.54 U. Late grain development in FRT05 is thus characterized by high AI and NI activity. In FRT64, both AI and NI activities were low in SG grain. Both activities remained stable between the SG and LG stages and increased between the LG and R stages, the increase being greater for NI than for AI. FRT64 showed the same increase in neutral invertase activity between the LG and R stages as FRT05, but acid invertase activity at the R stage was 2.5 times higher in FRT05 than in FRT64. In conclusion, FRT05 and FRT64 had the same final sucrose concentration in mature grain, but their invertase activity, mainly acid invertase activity, was drastically different.

Overall, it appears that CCCA12 may accumulate more sucrose than FRT05 and FRT64 in part because of weaker global invertase activity at the final stage of maturation. Even though sucrose is synthesized after import from the phloem, invertase activity in late development appears to limit sucrose accumulation in both robustas by immediately degrading it.
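The comparison of global invertase activity at maturity can be illustrated by summing the acid and neutral activities from Table 3. This is only a sketch: treating the simple sum of AI and NI as "global" invertase activity is an assumption made here for illustration, and the names used are hypothetical.

```python
# Invertase activities from Table 3, in μmoles · h⁻¹ · mg⁻¹ protein:
# (acid invertase, neutral invertase) per genotype and stage.
ACTIVITY = {
    "FRT05": {"SG": (1.50, 0.43), "LG": (0.58, 0.17), "Y": (0.26, 0.3), "R": (1.44, 0.54)},
    "FRT64": {"SG": (0.21, 0.15), "LG": (0.19, 0.12), "Y": (0.45, 0.28), "R": (0.58, 0.51)},
    "CCCA12": {"SG": (0.17, 0.09), "LG": (1.70, 0.49), "Y": (0.19, 0.20), "R": (0.34, 0.14)},
}


def total_activity(genotype: str, stage: str) -> float:
    """Sum of acid and neutral invertase activity (assumed proxy for global activity)."""
    acid, neutral = ACTIVITY[genotype][stage]
    return acid + neutral


# Rank the genotypes by summed activity at the mature red (R) stage.
ranked = sorted(ACTIVITY, key=lambda g: total_activity(g, "R"))
for genotype in ranked:
    print(f"{genotype}: AI + NI at R stage = {total_activity(genotype, 'R'):.2f} U")

# Ratio of acid invertase activity between the two robustas at the R stage.
ai_ratio = ACTIVITY["FRT05"]["R"][0] / ACTIVITY["FRT64"]["R"][0]
print(f"AI at R stage, FRT05 / FRT64 = {ai_ratio:.1f}")
```

On this measure, CCCA12 has the lowest summed activity at the R stage and FRT05 the highest, which is in line with the interpretation given above.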

Example 5 Invertase and Invertase Inhibitors mRNA Accumulation During Coffee Bean Maturation

The expression of the invertase genes CcInv1, CcInv2 and CcInv3, as well as the invertase inhibitor genes CcInvI1, 2, 3 and 4, during T2308 (C. arabica) and BP409 (C. canephora) grain development was characterized. For comparative purposes, we also characterized the expression of these genes in different coffee tissues such as leaf, flower and root. It is noted that these gene expression studies relate to different varieties from those used in the enzyme activity analysis experiments. Nevertheless, these expression data do allow an overall comparison between the expression of these genes in arabica versus robusta.

RNA had been extracted from BP409 and T2308 coffee cherries at four different maturation stages characterized by size and color, i.e. SG (small green), LG (large green), Y (yellow) and R (red or mature). For each stage, the pericarp and grain were separated before total RNA was extracted as described in Example 1. Total RNA was also extracted from other tissues (leaf, root and flower). Gene expression was analyzed by performing real time RT-PCR (TaqMan, Applied Biosystems). Relative transcript levels were quantified against an endogenous constitutive transcript rpl39. The gene specific primers and the TaqMan probes used are listed in Table 2 above.
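For readers unfamiliar with relative quantification, the sketch below shows one common way a relative quantity (RQ) can be derived from real time RT-PCR threshold cycles, normalizing a target transcript to a reference transcript such as rpl39. It assumes the comparative Ct approach with approximately 100% amplification efficiency; the text above does not state which calculation was actually used, and the Ct values in the example are invented for illustration only.

```python
def relative_quantity(ct_target: float, ct_reference: float) -> float:
    """Relative transcript level as 2^-(Ct_target - Ct_reference).

    Assumption: both amplicons double every cycle (about 100% efficiency).
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values, not taken from this specification.
ct_invertase = 27.0   # target transcript
ct_rpl39 = 22.0       # reference transcript

rq = relative_quantity(ct_invertase, ct_rpl39)
print(f"RQ = {rq}")
```

A target that crosses the threshold five cycles later than the reference is thus estimated at 2^-5 of the reference level; equal Ct values give an RQ of 1.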

The first general observation regarding CcInv gene expression is that these genes were found to be poorly expressed, especially in grain, at all maturation stages and for both genotypes (FIG. 8). CcInv1 transcripts (a cell wall invertase) were not detected in grain of either genotype. Interestingly, transcripts for CcInv1 were not detected in T2308 pericarp, while significant levels could be detected in the pericarp of BP409 at the same stages. Conversely, relatively significant levels of CcInv1 were detected in the root and leaf tissues of BP409 but not in the same tissues from T2308. This inverse expression strongly suggests that these differences are not due to allelic differences in the BP409 and T2308 genes encoding these transcripts, but are apparently due to differences in the transcript levels of these genes in each genotype. A very high level of CcInv2 expression was detected in the flowers of T2308 relative to the expression in BP409 (FIG. 8, panel A; approximately 10-fold difference, T2308 (4.8 RQ) versus BP409 (0.38 RQ)).

It has been noted previously that there are significant differences in the expression of several other genes in the whole flower samples of T2308 and BP409 used here (for example CcHQT, CcPAL1 and CcPAL3, unpublished data), which has led to the idea that these whole flower samples may not be at precisely the same developmental stage. When the expression data for CcInv2 were investigated in more detail (FIG. 8, panel B), it was seen that, apart from the small green grain of robusta, CcInv2 was expressed at very low levels in the grain of arabica or robusta. It is noted, however, that there appears to be a slight tendency for the weak expression of CcInv2 in the grain to increase towards maturity. A relatively significant expression of CcInv2 was detected in the arabica and robusta pericarp tissues, although the pattern of this expression was different.

In all the arabica pericarp stages tested, expression was relatively similar, while in robusta, expression of CcInv2 was very low in the small green pericarp and then increased gradually, with the highest expression being detected in the mature pericarp tissue. Low CcInv2 expression was also detected in the roots and leaves of BP409, but not in T2308.

The highest expression of CcInv3, which is believed to encode a neutral (cytoplasmic) invertase, was found in the flowers of arabica and robusta. Much lower levels of CcInv3 expression were detected in the other tissues. In all stages in the grain, the level of CcInv3 transcripts appeared to be marginally higher in arabica than robusta, while in the pericarp, the opposite appeared to be the case, with expression in robusta being marginally higher at the large green to red stages than in arabica.

While the control of invertases at the transcriptional level is important, significant control can also be exerted at the post-translational level by the interaction of invertase proteins with a group of small molecular weight proteins (<20 kDa) called invertase inhibitors (Greiner et al., 1998; Greiner et al., 2000; Helentjaris et al., 2001; Bate et al., 2004). As noted above, four full-length cDNAs believed to encode invertase inhibitors were isolated from the EST libraries. The results of the expression analysis of these genes are presented in FIG. 9.

In arabica, CcInvI1 was found to be exclusively expressed in the grain at the small green stage and, to a much lesser extent, at the large green stage, while in robusta this gene was expressed primarily in the large green grain (FIG. 9). Very low levels of CcInvI1 expression were detected in both arabica and robusta yellow grain, but not in mature (red) grain.

Less specificity was seen for the expression of CcInvI2 (FIG. 9). This gene is expressed at a relatively high level in whole flowers of both arabica and robusta. In arabica and robusta pericarp, CcInvI2 expression can be detected at a relatively low level during the small green stage, but it clearly increases significantly in both species as the cherries mature. CcInvI2 appears to be expressed at extremely low levels at all stages in the grain, as well as in roots and leaves.

Like CcInvI1, the expression of CcInvI3 and CcInvI4 showed a high level of tissue specificity. CcInvI3 appears to be exclusively expressed in the small green grain of arabica and in the yellow grain of robusta. CcInvI4 expression was detected almost exclusively in the small green tissue of arabica grain, while in robusta, it was expressed in the large green grain and, to a lesser extent, in the leaves.

REFERENCES

- Altschul S. F., Madden T. L., Schaffer A. A., Zhang J., Zhang Z., Miller W. and Lipman D. 1990. Gapped BLAST and PSI-Blast: a new generation of protein database search. Nucleic Acids Res. 25: 3389-3402.
- Arabidopsis Genome Initiative. 2000. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature. 408: 796-815.
- Badoud R., 2000. “What do we know about coffee chemistry, flavour formation and stability?” Internal Note, 23 Oct. 2000.
- Bate N. J., Niu X., Wang Y., Reimann K. S. and Helentjaris T. G. 2004. An invertase inhibitor from maize localizes to the embryo surrounding region during early kernel development. Plant Physiol. 134.1-9.
- BenAmor M. and Mc Carthy J. 2003. Modulation of coffee flavour precursor levels in green coffee grains. European patent Application No. 03394056.0 NESTEC S. A.
- Chahan Y., Jordon A., Badoud R. and Lindinger W. 2002. From the green bean to the cup of coffee: investing coffee roasting by on-line monitoring of volatiles. Eur Food Res Technol. 214:92-104.
- Cheng W.-H., Taliercio E. W. and Chourey P. S. 1996. The Miniature 1 seed locus of maize encodes a cell wall invertase required for normal development of endosperm and maternal cells in the pedicel. Plant Cell. 8:971-983.
- Clough, S. J. and Bent A. F. 1998. Floral dip: a simplified method for Agrobacterium-mediated transformation of Arabidopsis thaliana. Plant Journal 16; 735-743.
- Crouzillat D., Lerceteau E., Petiard V., Morera J., Rodriguez H., Walker D., Philips W. R. R., Schnell J., Osei J. and Fritz P. 1996. Theobroma cacao L.: a genetic linkage map and quantitative trait loci analysis. Theor Appl Genet. 93: 205-214.
- Dali N., Michaud D. and Yelle S. 1992. Evidence for the involvement of sucrose phosphate synthase in the pathway of sugar accumulation in sucrose-accumulating tomato fruits. Plant Physiol. 99:434-438.
- Dickinson C. D., Atabella T. and Chrispeels M. J. 1991. Slow growth phenotype of transgenic tomato expressing apoplastic invertase. Plant Physiol. 95:51-57.
- Fridman E. and Zamir D. 2003. Functional divergence of a synthetic invertase gene family in tomato, potato and Arabidopsis. Plant Physiol. 131: 603-609.
- Fridman E, Carrari F, Liu Y S, Fernie A R, Zamir D. 2004. Zooming in on a quantitative trait for tomato yield using interspecific introgressions. Science. 305(5691): 1786-9.
- Godt, D. E. and T. Roitsch. 1997. Regulation and tissue-specific distribution of mRNAs for three extracellular invertase isoenzymes of tomato suggests an important function in establishing and maintaining sink metabolism. Plant Physiol 115:273-282.
- Grandillo S. and Tanksley S. D. 1996. QTL analysis of horticultural traits differentiating the cultivated tomato fruit from the closely related species L. pimpinellifolium. Theor Appl Gene. 92: 935-951.
- Greiner, S. Krausgrill S., and Rausch, T. 1998. Cloning of a tobacco apoplasmic invertase inhibitor. Proof of function of the recombinant protein and expression analysis during plant development. Plant Physiol. 1116: 733-742.
- Greiner S., Rausch T., Sonnewald U. and Herbers K. 1999. Ectopic expression of a tobacco invertase inhibitor homolog prevents cold-induced sweetening of potato tubers. Nature Biotech. 17: 708-711.
- Greiner S. Köster Lauer K, Rosenkranz H, Vogel R, Rausch T. 2000. Plant invertase inhibitors: expression in cell culture and during plant development. Australian Journal of Plant Physiology, 27: 807-814.
- Helentjaris, T., Bate, N. J. and Allen, S. M. 2001. Novel invertase inhibitors and methods of use. Patent: WO 0158939. PIONEER HI-BRED INTERNATIONAL, INC. (US); E.I. DU PONT DE NEMOURS AND COMPANY (US).
- Holscher, W. and Steinhart, H. 1995. Development in Food Science V37A Food Flavors: Generation, Analysis and Process Influence. Elsevier, 785-803.
- Hothorn M., Bonneau F., Stier G., Greiner S. and Scheffzek K. 2003. Bacterial expression, purification and preliminary X-ray crystallographic characterization of the invertase inhibitor Nt-CIF from tobacco. Acta Cryst D59:2279-2282.
- Hothorn M., Wolf S., Aloy P., Greiner S, and Scheffzek K. 2004. Structural insights into the target specificity of plant invertase and pectin methylesterase inhibitory proteins. Plant Cell. 16:3437-3447.
- Illy, A. and Viani, R. 1995. Espresso Coffee: The Chemistry of Quality. Academic Press. London Academic Press Ltd.
- King S. P., Lunn, J. E. and Furbank R. T. 1997. Carbohydrate content and enzyme metabolism in developing canola siliques. Plant Physiol. 114: 153-160.
- Klann E., Yelle S. and Bennett A. B. 1992. Tomato acid invertase complementary DNA. Plant Physiol. 99: 351-353.
- Klann E. M., Chetelat R. T. and Bennett A. B. 1993. Expression of acid invertase gene controls sugar composition in tomato (Lycopersicon) fruit. Plant Physiol. 103: 863-870.
- Klann E. M., Hall B., and Bennett A. B. 1996. Antisense acid invertase (TIV1) gene alters soluble sugar composition and size in transgenic tomato fruit. Plant Physiol. 112: 1321-1330.
- Leloup V., Gancel C., Rytz, A. and Pithon, A. 2003. Precursors of Arabica character in green coffee, chemical and sensory studies. R&D Report RDOR-RD030009.
- Lowe J. and Nelson O. E., Jr. 1946. Miniature seed - A study in the development of a defective caryopsis in maize. Genetics. 31: 525-533.
- Marraccini P., Deshayes A., Pétiard V. and Rogers W. J. 1999. Molecular cloning of the complete 11S seed storage protein gene of Coffea arabica and promoter analysis in the transgenic tobacco plants. Plant Physiol. Biochem. 37:273-282.
- Marraccini P, Courjault C, Caillet V, Lausanne F, LePage B, Rogers W, Tessereau S, and Deshayes A. 2003. Rubisco small subunit of Coffea arabica: cDNA sequence, gene cloning and promoter analysis in transgenic tobacco plants. Plant Physiol. Biochem. 41:17-25.
- Miller M. E. and Chourey P. S. 1992. The maize invertase-deficient miniature-1 seed mutation is associated with aberrant pedicel and endosperm development. Plant Cell. 4: 297-305.
- Miron D. and Schaffer A. A. 1991. Sucrose phosphate synthase, sucrose synthase and invertase activities in developing fruit of Lycopersicon hirsutum Humb. And Bonpl. Plant Physiol. 95: 623-627.
- N'tchobo H., Dali N., Nguyen-Quoc B., Foyer C. H. and Yelle S. 1999. Starch synthesis in tomato remains constant throughout fruit development and is dependent on sucrose supply and sucrose activity. J. Exp. Bot. 50. 1457-1463.
- Nguyen-Quoc, B. and C. H. Foyer. 2001. A role for ‘futile cycles’ involving invertase and sucrose synthase in sucrose metabolism of tomato fruit. J. Exp. Bot. 52:881-889.
- Ohyama A., Ito H., Sato T., Nishimura S., Imai T. and Hirai M. 1995. Suppression of acid invertase activity by antisense RNA modifies the sugar composition of tomato fruit. Plant Cell Physiol. 36: 369-376.
- Privat I., Eychenne M., Kandalaft L., Caillet C., Lin C., Tanksley S. and James McCarthy. 2005. Molecular characterization of sucrose synthase CcSS2 and sucrose phosphate synthase CcSPS1 genes: quantitative expression and enzymatic activity in low and high sucrose coffee varieties. Internal Report.
- Rausch T. and Greiner S. 2004. Plant protein inhibitors of invertases. Biochimica et Biophysica Acta. 1696: 253-261.
- Robinson N. L., Hewitt J. D. and Bennett A. B. 1998. Sink metabolism in tomato fruit. Plant Physiol. 87:732-730.
- Rogers W. J., Michaux S., Bastin M. and P. Bucheli. 1999. Changes to the content of sugars, sugar alcohols, myo-inositol, carboxylic acids and inorganic anions in developing grains from different varieties of Robusta (Coffea canephora) and Arabica (C. arabica) coffees. Plant Sc. 149:115-123.
- Roitsch T. and Gonzalez M-C. 2004. Function and regulation of plant invertases: sweet sensations. Trends in Plant Science. 9 (12): 606-613.
- Russwurm, H. 1969. Fractionation and analysis of aroma precursors in green coffee, ASIC 4: 103-107.
- Scholes J., Bundock N., Wilde R. and Rolfe S. 1996. The impact of reduced vacuolar invertase activity on the photosynthetic and carbohydrate metabolism of tomato. Planta. 200: 265-272.
- Scognamiglio M. A., Ciardiello M. A., Tamburrini M., Carratore V., Rausch T. and Camardella L. 2003. The plant invertase inhibitor shares structural properties and disulfide bridges arrangement with the pectin methylesterase inhibitor. Journal of Protein Chemistry. 22 (3):363-369.
- Sturm A. Chrispeels M. J. 1990. cDNA cloning of carrot extracellular β-fructosidase and its expression in response to wounding and bacterial infection. Plant Cell 2: 1107-1119.
- Sun J., Loboda T., Sung S. J. S, and Black, C. C. J. 1992. Sucrose synthase in wild tomato, Lycopersicon chmielewskii, and tomato fruit sink strength. Plant Physiol. 98: 1163-1169.
- Tang G. Q., Luscher M. and Sturm A. 1999. Antisense repression of vacuolar and cell wall invertase in transgenic carrot alters early plant development and sucrose partitioning. Plant Cell. 11: 177-189.
- Tanksley S. D., Grandillo T. M., Fulton T. M., Zamir D., Eshed Y., Petirad V., Lopez J. and Beck-Bunn T. 1996. Advanced backcross QTL analysis in a cross between an elite processing line of tomato and its wild relative L. pimpinellifolium. Theor Appl Gene. 92:213-224.
- von Schaewen A., Stitt M., Schmidt R., Sonnewald U. and Willmitzer L. 1990. Expression of yeast-derived invertase in the cell wall of tobacco and Arabidopsis plants leads to accumulation of carbohydrate and inhibition of photosynthesis and strongly influences growth and phenotype of transgenic tobacco plants. EMBO J. 9: 3033-3044.
- Wang F., Smith A. G. and Brenner M. L. 1993. Sucrose synthase starch accumulation and tomato fruit sink strength. Plant Physiol 101:321-327.
- Weil M., Krausgrill S., Schuster A. and Rausch T. 1994. A 17 kDa Nicotiana tabacum cell-wall peptide acts as an in vitro inhibitor of the cell-wall isoform of acid invertase. Planta. 193: 438-445.
- Yau Y-Y and Simon P. W. 2003. A 2.5 kb insert eliminates acid soluble invertase isozyme II transcript in carrot (Daucus carota L.) roots, causing high sucrose accumulation. Plant Mol Biol. 53: 151-162.
- Yelle S., Chetelat R. T., Dorais M., Deverna J. W. and Bennett A. 1991. Sink metabolism in tomato fruit. Genetic and biochemical analysis of sucrose accumulation. Plant Physiol. 95: 1026-1036.
- Ziegler H. 1975. Nature of transported substances. Encyclopedia of Plant Physiology. 25: 505-509.
- Zrenner, R., Salanoubat, M., Willmitzer, L., and Sonnewald, U. 1995. Evidence of crucial role of sucrose synthase for sink strength using transgenic potato plants (Solanum tuberosum L.). Plant J. 7:97-107.

The present invention is not limited to the embodiments described and exemplified above, but is capable of variation and modification within the scope of the appended claims. 

1. A nucleic acid molecule isolated from coffee (Coffea spp.) comprising a coding sequence that encodes an invertase or an invertase inhibitor.
 2. The nucleic acid molecule of claim 1, wherein the coding sequence encodes an invertase.
 3. The nucleic acid molecule of claim 2, wherein the invertase is a cell wall invertase, a vacuolar invertase or a neutral invertase.
 4. The nucleic acid molecule of claim 3, wherein the invertase is a cell wall invertase and comprises a conserved domain having amino acid sequence WECPDF.
 5. The nucleic acid molecule of claim 4, wherein the invertase comprises an amino acid sequence greater than 55% identical to SEQ ID NO:9 or SEQ ID NO:13.
 6. The nucleic acid molecule of claim 5, wherein the invertase comprises SEQ ID NO:9 or SEQ ID NO:13.
 7. The nucleic acid molecule of claim 6, comprising SEQ ID NO:1 or SEQ ID NO:4.
 8. The nucleic acid molecule of claim 3, wherein the invertase is a vacuolar invertase and comprises a conserved domain having amino acid sequence WECVDF.
 9. The nucleic acid molecule of claim 8, wherein the invertase comprises an amino acid sequence 70% or more identical to SEQ ID NO:10.
 10. The nucleic acid molecule of claim 9, wherein the invertase comprises SEQ ID NO:10.
 11. (canceled)
 12. The nucleic acid molecule of claim 3, wherein the invertase is a neutral invertase.
 13. The nucleic acid molecule of claim 12, wherein the invertase comprises an amino acid sequence 84% or more identical to SEQ ID NO:11.
 14. The nucleic acid molecule of claim 13, wherein the invertase comprises SEQ ID NO:11.
 15. (canceled)
 16. The nucleic acid molecule of claim 1, wherein the coding sequence encodes an invertase inhibitor.
 17. The nucleic acid molecule of claim 16, wherein the invertase inhibitor comprises four conserved cysteine residues in its amino acid sequence.
 18. The nucleic acid molecule of claim 17, wherein the invertase inhibitor comprises an amino acid sequence that is 25% or more identical to any one of SEQ ID NOS: 13, 14, 15 or 16.
 19. The nucleic acid molecule of claim 18, wherein the invertase inhibitor comprises any one of SEQ ID NOS: 13, 14, 15 or 16.
 20. (canceled)
 21. (canceled)
 22. (canceled)
 23. (canceled)
 24. (canceled)
 25. A coding sequence of the nucleic acid molecule of claim 1, contained within a vector.
 26. The vector of claim 25, which is an expression vector selected from the group of vectors consisting of plasmid, phagemid, cosmid, baculovirus, bacmid, bacterial, yeast and viral vectors.
 27. The vector of claim 25, wherein the coding sequence of the nucleic acid molecule is operably linked to a constitutive promoter, or an inducible promoter, or a tissue-specific promoter.
 28. (canceled)
 29. (canceled)
 30. The vector of claim 27, wherein the tissue specific promoter is a seed specific promoter.
 31. The vector of claim 30, wherein the seed specific promoter is a coffee seed specific promoter.
 32. A host cell transformed with the vector of claim 25.
 33. The host cell of claim 32, selected from the group consisting of plant cells, bacterial cells, fungal cells, insect cells and mammalian cells.
 34. The host cell of claim 32, which is a plant cell selected from the group of plants consisting of coffee, tobacco, Arabidopsis, maize, wheat, rice, soybean, barley, rye, oats, sorghum, alfalfa, clover, canola, safflower, sunflower, peanut, cacao, tomatillo, potato, pepper, eggplant, sugar beet, carrot, cucumber, lettuce, pea, aster, begonia, chrysanthemum, delphinium, zinnia, and turfgrasses.
 35. A fertile plant produced from the plant cell of claim 34.
 36. A method of modulating flavor or aroma of coffee beans, comprising modulating production or activity of one or more invertases or invertase inhibitors within coffee seeds.
 37. The method of claim 36, comprising increasing production or activity of the one or more invertases or invertase inhibitors.
 38. (canceled)
 39. (canceled)
 40. The method of claim 37, comprising increasing production or activity of one or more invertase inhibitors.
 41. The method of claim 40, wherein endogenous invertase activity in the plant is decreased as compared with an equivalent plant in which production or activity of the invertase inhibitor is not increased.
 42. The method of claim 40, wherein the plant comprises more sucrose in its seeds than does an equivalent plant in which production or activity of the invertase inhibitor is not increased.
 43. The method of claim 36, comprising decreasing production or activity of the one or more invertases or invertase inhibitors.
 44. (canceled)
 45. The method of claim 44, wherein expression or activity of an invertase is decreased.
 46. The method of claim 45, wherein the plant comprises more sucrose in its seeds than does an equivalent plant in which production or activity of the invertase is not decreased. 